Abstract

We analyze the achievable communication rates of a generalized soliton-based transmission system for the 
optical fiber channel. This method is based on modulation of parameters of the scattering domain, via the inverse 
scattering transform, by the information bits. The decoder uses the direct spectral transform to estimate these 
parameters and decode the information message. Unlike ordinary On-Off Keying (OOK) soliton systems, the 
solitons' amplitude may take values in a continuous interval. A considerable rate gain is shown in the case where 
the waveforms are 2-bound soliton states. Using traditional information theory and inverse scattering perturbation 
theory, we analyze the influence of the amplitude fluctuations as well as soliton arrival time jitter, on the achievable 
rates. Using this approach we show that the time of arrival jitter (Gordon-Haus) limits the information rate in a 
continuous manner, as opposed to a strict threshold in OOK systems. 



I. Introduction 

Communication through optical fiber channels has evolved enormously in the past couple of decades 
leading to unprecedented information rates. Current information theoretic techniques are unsuccessful in 
producing relevant methods to predict capacity bounds for these channels. 

The nonlinear terms that affect signal evolution lead to the following question: is the information capacity
of the optical fiber channel monotonically increasing with the input power, and if so, does the capacity
grow logarithmically with power as it does for linear channels? Moreover, as the complexity allowed
in receivers grows, one looks for insights regarding the best (not necessarily the simplest) modulation
schemes, signal space and error correcting codes.

The basic generic partial differential equation (PDE) that describes the value of the electric field in 
space and time (in one dimension) in the optical fiber channel is (using normalized coordinates and the 
notations of IfTI): 

$$ i\frac{\partial q}{\partial Z} + \frac{1}{2}\frac{\partial^2 q}{\partial T^2} + |q|^2 q = 0 \qquad (1) $$

where the input of the channel is $q(0,T)$ and the output is $q(L,T)$. This equation is also known as the
nonlinear scalar Schrödinger (NLS) equation.
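As a rough numerical illustration of this channel, the sketch below propagates an input waveform with a symmetric split-step Fourier scheme. It assumes the normalized form $i q_Z + \frac{1}{2} q_{TT} + |q|^2 q = 0$ and a periodic time window wide enough that the waveform is negligible at its edges; the function name `split_step_nls` and all grid parameters are illustrative choices, not part of the paper.

```python
import numpy as np

def split_step_nls(q0, t, z_end, n_steps):
    """Propagate q(T, 0) to q(T, z_end) under i q_Z + (1/2) q_TT + |q|^2 q = 0."""
    dz = z_end / n_steps
    dt = t[1] - t[0]
    omega = 2.0 * np.pi * np.fft.fftfreq(t.size, d=dt)    # angular frequency grid
    half_linear = np.exp(-0.5j * omega**2 * (dz / 2.0))   # dispersion over dz / 2
    q = np.asarray(q0, dtype=complex)
    for _ in range(n_steps):
        q = np.fft.ifft(half_linear * np.fft.fft(q))
        q = q * np.exp(1j * np.abs(q)**2 * dz)            # nonlinear phase over dz
        q = np.fft.ifft(half_linear * np.fft.fft(q))
    return q

# Fundamental soliton with amplitude eta: the envelope should survive propagation.
eta, z_end = 1.0, 1.0
t = np.linspace(-20.0, 20.0, 1024, endpoint=False)
q_in = eta / np.cosh(eta * t)
q_out = split_step_nls(q_in, t, z_end, n_steps=400)
q_ref = q_in * np.exp(0.5j * eta**2 * z_end)              # analytic soliton at z_end
print("max deviation from the analytic soliton:", np.max(np.abs(q_out - q_ref)))
```

The sech-shaped input only acquires a phase, which is the noiseless behaviour that the soliton-based schemes discussed later rely on.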

Since the equivalent channel is nonlinear, a Fourier frequency based analysis is not applicable. The
usual way to analyze a continuous-time channel in traditional information theoretic methods is to reduce
the problem into a discrete one by considering the Nyquist samples of the input and output. However,
since a bandlimited input signal evolves into an output signal of an infinite bandwidth, it is hard to
find such discrete-time models. We stress that the nonlinearity invoked by the channel is fundamental
and is conceptually different from nonlinearities caused by transmitter/receiver elements, e.g., amplifier
nonlinearities, that have been studied in the past.

A different approach to analyzing signal evolution in nonlinear channels is the inverse scattering
transform (IST). In this paper we present this method and apply it to a few tractable problems in which
we approximate the achievable data rates. We also explain how this method should be developed to
characterize the channel capacity and useful modulation schemes. A similar approach, first proposed by
Hasegawa and Nyu ([2]), suggested using multiple solitonic waveforms. It should be noted that the IST
approach presented in this paper is not complete in the following aspects:

• It does not provide single letter results for capacity but rather a new method to evaluate it which we 
feel is more esthetic and better suited for this channel. 

• It does not solve the problems associated with the bounded symbol rate for solitonic waveforms 
which is characterized by the Gordon-Haus bound ([3]). 

• It lacks a simple representation of the manner in which white noise is projected onto complex solitonic 
waveforms. 

We now give a short introduction to the inverse scattering transform which solves a set of nonlinear 
evolution problems via the solution of three linear problems. A recent more complete introduction to the 
IST and its properties can be found in

II. A PRIMER ON THE INVERSE SCATTERING TRANSFORM 

The inverse scattering method does not consist of a single generic transform. In fact, it is more like a
recipe for solving a family of nonlinear evolution problems. This recipe involves finding two q-dependent
operators, L and M, that obey certain conditions. The first operator of the two defines an eigenvalue
problem for an auxiliary wave function. This problem gives rise to solutions that obey boundary conditions
at $-\infty$ and $+\infty$. The way these solutions evolve from $-\infty$ to $+\infty$ defines the scattering coefficients, or the
scattering data, which are analogous to spectral content in the Fourier frequency domain for linear channel
problems. Extracting the scattering data from the q-dependent operator is called the direct transform. Due
to special properties of the above operators, the evolution of the scattering data in time is rather simple.
Moreover, there is a well-defined inverse transform that maps the scattering data back to q. All of the
above steps (direct transform, inverse transform and time evolution) are essentially linear problems. We
now present the details of the IST for the NLS.

To solve integrable systems such as the NLS one needs to express the system as a compatibility condition
of two linear equations for a wave function $\Psi(T, Z; \zeta)$:

$$ L(Z)\Psi = \zeta\Psi \qquad (2) $$

$$ \frac{\partial \Psi}{\partial Z} = M(Z)\Psi \qquad (3) $$

where L and M are differential operators in the T-derivatives and are called a Lax pair if:

$$ \frac{\partial L}{\partial Z} = ML - LM = [M, L]. \qquad (4) $$

The right hand side is called the commutator of M and L. If (4) holds then one can show that the
eigenvalues of the operator L are Z-invariant:

$$ \frac{d\zeta}{dZ} = 0, $$

even though L is not Z-invariant.

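Finding a Lax pair for a given channel is not an obvious task. The Lax pair for the NLS, found by
Zakharov and Shabat, is given by:

$$ L = \begin{pmatrix} i\frac{\partial}{\partial T} & q \\ -q^* & -i\frac{\partial}{\partial T} \end{pmatrix} \qquad (5) $$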
It is readily verified that for these operators, equation (4) results in the NLS equation. To solve equation
(2) we define vector wave functions for real $\zeta = \xi$ with asymptotic boundary conditions:

$$ \Phi(T;\xi) \to \begin{pmatrix} 1 \\ 0 \end{pmatrix} e^{-i\xi T}, \quad T \to -\infty \qquad (7) $$

$$ \Psi(T;\xi) \to \begin{pmatrix} 0 \\ 1 \end{pmatrix} e^{i\xi T}, \quad T \to \infty. \qquad (8) $$

The pair $\Psi = (\psi_1, \psi_2)^T$, $\bar{\Psi} = (\psi_2^*, -\psi_1^*)^T$ is a complete system of solutions for (2). Therefore:

$$ \Phi(T;\xi) = a(\xi)\bar{\Psi} + b(\xi)\Psi. \qquad (9) $$

For $T \to \infty$ we have:

$$ \Phi(T;\xi) \to a(\xi)\begin{pmatrix} 1 \\ 0 \end{pmatrix} e^{-i\xi T} + b(\xi)\begin{pmatrix} 0 \\ 1 \end{pmatrix} e^{i\xi T}. \qquad (10) $$

Comparing with equation (7) we recognize $1/a(\xi)$ and $b(\xi)/a(\xi)$ as the transmission and reflection
coefficients which characterize the scattering data. The origin of these names is in the fact that they
describe what happens to a wave as it evolves from $-\infty$ to $\infty$ and scatters due to a certain "potential",
q (these terms are borrowed from quantum physics).
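The direct transform can be sketched numerically as follows. This is a minimal illustration, assuming the standard Zakharov-Shabat first-order system $v_T = \begin{pmatrix} -i\zeta & q \\ -q^* & i\zeta \end{pmatrix} v$ (which may differ from the paper's operator by a constant phase of the potential, leaving $a(\zeta)$ unchanged) and a piecewise-constant approximation of $q$ on a truncated window. The names `scattering_a` and `sech_potential` are hypothetical.

```python
import numpy as np

def scattering_a(q_func, zeta, t_max=20.0, n_steps=8000):
    """Estimate a(zeta) by integrating v_T = [[-i zeta, q], [-conj(q), i zeta]] v."""
    h = 2.0 * t_max / n_steps
    # Boundary condition (7): v -> (1, 0)^T exp(-i zeta T) at T = -t_max.
    v = np.array([np.exp(1j * zeta * t_max), 0.0], dtype=complex)
    for n in range(n_steps):
        t_mid = -t_max + (n + 0.5) * h
        q = complex(q_func(t_mid))
        A = np.array([[-1j * zeta, q], [-np.conj(q), 1j * zeta]])
        # A is traceless and A @ A = -(zeta^2 + |q|^2) I, so exp(h A) is closed form.
        k = np.sqrt(complex(zeta**2 + abs(q)**2))
        if abs(k) < 1e-12:
            step = np.eye(2) + h * A
        else:
            step = np.cos(k * h) * np.eye(2) + (np.sin(k * h) / k) * A
        v = step @ v
    # By (10), a(zeta) multiplies (1, 0)^T exp(-i zeta T) at T = +t_max.
    return v[0] * np.exp(1j * zeta * t_max)

def sech_potential(t):
    return 1.0 / np.cosh(t)   # single soliton with amplitude eta = 1

print(abs(scattering_a(sech_potential, 0.5j)))  # close to zero: bound state at i/2
print(abs(scattering_a(sech_potential, 0.3)))   # close to one: reflectionless
```

For this potential the closed form $a(\zeta) = (\zeta - i/2)/(\zeta + i/2)$ is known, so the zero at $\zeta = i/2$ and the unit modulus on the real axis give a quick sanity check of the integration.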

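The discrete eigenvalues of the direct scattering problem are the set of points:

$$ \zeta = \{\zeta_n,\ n = 1, 2, \ldots, N;\ \mathrm{Im}(\zeta) > 0 \ \text{s.t.}\ a(\zeta) = 0\} \qquad (11) $$

for which:

$$ \Phi(T;\zeta_n) = b_n \Psi(T;\zeta_n). \qquad (12) $$

Equation (12) shows that both $\Psi$ and $\Phi$ approach zero as $|T|$ approaches infinity. The scattering data, which
has a one-to-one correspondence with q and hence carries the same information, is comprised of:

$$ S(Z = 0) = \left[\, r(\xi; 0) = \frac{b(\xi)}{a(\xi)} \ \text{for real } \xi,\ \{\zeta_n, C_n(0)\} \ \text{for } n = 1, 2, \ldots, N \,\right], \qquad (13) $$

where:

$$ C_n(0) = \frac{b_n(0)}{a'_n(0)}, \qquad a'_n(0) = \left.\frac{\partial a}{\partial \zeta}\right|_{\zeta = \zeta_n} \qquad (14) $$

are called the norming constants of the bound states.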

The time evolution of the scattering data is governed by The solution of which (see [HI) is: 

$$ r(\xi; Z) = r(\xi; 0)\, e^{-2i\xi^2 Z} \qquad (15) $$

$$ C_n(\zeta_n; Z) = C_n(\zeta_n; 0)\, e^{-2i\zeta_n^2 Z} \qquad (16) $$

$$ \zeta_n(Z) = \zeta_n(0). \qquad (17) $$

The inverse problem of finding q given the scattering data is solved by a set of linear integral equations 
which are beyond the scope of this introduction. 

The IST is important because it allows the use of linear techniques to solve initial value problems for
nonlinear problems. The main advantages of the IST are that the number of degrees of freedom that a
signal is comprised of, i.e. number of solitons and radiation bandwidth, does not change through signal
evolution and that there are natural invariant-over-time scalar entities, i.e. eigenvalues. The evolution of
the solution in time is most naturally described through the IST and thus the IST may lead us to insights
regarding communication strategies. For an in-depth survey of the IST, also known as the nonlinear Fourier
transform, and an OFDM-like communication transmission method, see the paper by Yousefi et al. ([[U, 

lEl). 
Actually, Hasegawa and Nyu (see [2]) proposed a communication method that utilizes the fact that
the eigenvalues associated with the IST do not change in time. The advantages of the method proposed by
Hasegawa et al. are that it is inherently multi-valued and is similar to frequency based methods for linear
channels. The authors do not analyze the effects of amplifier noise on the eigenvalues and its implications
on channel capacity. In the following we elaborate on the ideas of eigenvalue communications, extend them,
and use results from perturbation theory for nonlinear models to estimate the
capacity of nonlinear channels. We extend the idea of eigenvalue communication to that of spectral data
modulation and use the inverse scattering transform as our transmitter and the direct spectral transform
in the receiver. We quantify the effects of amplitude fluctuations and jitter on achievable communication
rates and evaluate them for realistic configurations.

III. Carrying information using the scattering data 
We assume that the channel model is represented by: 

$$ i\frac{\partial q}{\partial Z} + \frac{1}{2}\frac{\partial^2 q}{\partial T^2} + |q|^2 q = \epsilon R \qquad (18) $$

where $\epsilon R$ is the perturbation term. Throughout this paper we assume that R is a white Gaussian noise
process (in space and time) with a unit power spectral density (PSD) and $\epsilon$ is used as a scaling parameter
for the noise power that can be related to the physical parameters of the channel. We will later plug in
these parameters to obtain practical results. The noise is generated by the effects of amplifiers that are
spread throughout the fiber but we assume it is injected adiabatically [J . 

The information rate, $R_b$, that can be achieved on this channel is upper bounded by the channel capacity
which is the maximal mutual information between the channel's input and output

$$ R_b \le \max I\big(q(0,T);\, q(L,T)\big), \qquad (19) $$

where the maximization is taken over some input constraint (e.g. an average power constraint, a peak
power constraint, Fourier bandwidth or maximal number of solitons). Evaluating the quantity above turns
out to be a very difficult task for nonlinear channels. In this paper we argue that the most tractable way of
evaluating this quantity is through the statistics of the scattering data of the IST, namely the eigenvalues
and the absolute value of the norming constants.

Since the IST is a one-to-one transformation, the mutual information between the waveforms is equivalent
to the mutual information between the scattering data, i.e.,

$$ I\big(q(0,T);\, q(L,T)\big) = I\big(S(Z = 0);\, S(Z = L)\big). \qquad (20) $$

To lower bound this quantity one can assume that the input is a reflectionless potential so that the
information is transmitted solely through the discrete eigenvalues and corresponding norming constants, i.e.,

$$ I\big(S(Z = 0);\, S(Z = L)\big) \ge I\big(\{\zeta_n(0), C_n(0)\};\, \{\zeta_n(L), C_n(L)\}\big) \quad \text{for } n = 1, 2, \ldots, \infty, $$

where the time index is added since the Gaussian noise changes the eigenvalues (that are otherwise
constant) and can also possibly change their number via the birth/death of a soliton.

The observation that the mutual information in a nonlinear integrable channel can and should be
evaluated through the statistics of the scattering data is the main observation in this paper. This approach is
motivated by several reasons. First, unlike the linear spectral domain (i.e., Fourier methods where spectral
broadening is a result of the nonlinearity) the number of degrees of freedom in the scattering domain
remains unchanged throughout the noiseless evolution. Second, the eigenvalues and norming constants
serve as scalar candidates for the transmission of information implying a new notion of a nonlinear signal
space. The evaluation of this mutual information is still a cumbersome task, yet it can be approximated assuming
some further restrictions on the input signals.

IV. Main Results 

In the generalized soliton transmission system we analyze, a codeword is a (large) set of symbols.
Each symbol is in fact a set of eigenvalues and norming constants. At the transmitter, the waveform to
be transmitted is generated using the inverse scattering transform. At the receiver, direct scattering is
applied to derive the set of (perturbed) eigenvalues and norming constants. The waveforms used by the
transmitter have infinite support but decay exponentially, so that if we truncate the waveforms to create a
finite symbol period at a suitable distance we can treat the resulting soliton interaction as being negligible
compared to the added noise.

Throughout this Section the imaginary parts of the eigenvalues, which can be considered to be generalized
amplitudes, will be the information carrying agents.

Fig. 1. The IST-based communication scheme



A. Information embedded in a single soliton 

In this setting single solitons are modulated. Unlike ordinary OOK, their amplitudes belong to a
continuous interval. Without a perturbation, the single soliton solution for the NLS is

$$ q(T, Z) = \eta\,\mathrm{sech}[\eta(T + \kappa Z - T_0)]\, \exp\left(-i\kappa T + \frac{i}{2}(\eta^2 - \kappa^2) Z + i\sigma_0\right), \qquad (21) $$

for which the corresponding discrete eigenvalue of the IST is $\zeta = (\kappa + i\eta)/2$. For the rest of the paper
we assume all eigenvalues are purely imaginary (except for perturbations). The localization of the soliton 
is around |To| = e''(°)^ 

We use results from ^ for the first order perturbations of the eigenvalues. The resulting fluctuation in 
the amplitude is: 

$$ \frac{d\eta}{dZ} = \epsilon \int \mathrm{Im}\left(R\, e^{-i\phi}\right) \mathrm{sech}\,\tau \, d\tau \qquad (22) $$

where $\tau = \eta(T - T_0)$ and $\phi$ is the phase of the soliton in (21).
Assuming R is a bandlimited white Gaussian noise, i.e. $\langle R(t, z), R(\tau, w) \rangle = \delta(t - \tau)\delta(z - w)$, we
get:

$$ E[(\eta(0) - \eta(Z))^2] = \epsilon^2 \eta Z \qquad (23) $$

i.e., the variance of the additive noise is proportional to $\eta$ (unlike ordinary multiplicative noise for which
the variance is proportional to $\eta^2$).
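The linear scaling of the variance with the amplitude can be checked with a small Monte-Carlo sketch of the first-order perturbation integral. It assumes that the noise is circularly symmetric, so that $\mathrm{Im}(R e^{-i\phi})$ carries half of the unit PSD, and it lumps the whole span $Z$ into a single step, which is adequate to first order; the function name and grid sizes are illustrative only.

```python
import numpy as np

def amplitude_variance_mc(eta, eps, z, n_trials=20000, n_grid=200,
                          t_span=20.0, seed=0):
    """Monte-Carlo estimate of E[(eta(0) - eta(Z))^2] to first order in eps."""
    rng = np.random.default_rng(seed)
    t = np.linspace(-t_span, t_span, n_grid, endpoint=False)
    dt = t[1] - t[0]
    # Assumption: Im(R e^{-i phi}) is white with PSD 1/2, discretised on cells dt x z.
    sigma = np.sqrt(0.5 / (dt * z))
    weights = eta * dt / np.cosh(eta * t)        # sech(tau) d tau, with tau = eta * T
    noise = rng.normal(0.0, sigma, size=(n_trials, n_grid))
    delta_eta = eps * z * (noise @ weights)      # integrated amplitude shift
    return np.var(delta_eta)

eta, eps, z = 1.5, 0.05, 2.0
print("simulated:", amplitude_variance_mc(eta, eps, z))
print("eps^2 * eta * Z:", eps**2 * eta * z)
```

Under these assumptions the estimate should agree with $\epsilon^2 \eta Z$ up to sampling error, since $\int \mathrm{sech}^2(\eta T)\, dT = 2/\eta$.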

Thus, assuming information is transmitted in the amplitude of a single soliton (T = 0) we have the
following scalar channel:

$$ Y = 2\,\mathrm{Im}(\zeta(Z)) = \eta(Z) = \eta + \sqrt{\eta}\, N \qquad (24) $$

where N is a Gaussian r.v. with zero mean and a variance of $\epsilon^2 Z$. We dismiss the probability that the
soliton vanishes completely and allow for Y to be theoretically zero (or negative). This scenario can
be prevented (with high probability) by using $\sqrt{\eta} \gg \epsilon\sqrt{Z}$, which in the limit of $\epsilon$ going to zero has
negligible effect on the capacity. We lower bound the mutual information for the case $\eta \in [\eta_{min}, \eta_{max}]$
with $\Delta\eta = \eta_{max} - \eta_{min}$. It is assumed that the noise is Gaussian and of the largest possible variance:



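$$ I \ge h(Y) - h(Y|\eta) \qquad (25) $$

$$ \ge h(\eta) - h(Y|\eta) \qquad (26) $$

$$ \ge h(\eta) - \frac{1}{2}\log\left(2\pi e\, \eta_{max} \epsilon^2 Z\right) \qquad (27) $$

$$ = \log\frac{\Delta\eta}{\sqrt{2\pi e\, \eta_{max} \epsilon^2 Z}} \quad \text{bits/soliton} \qquad (28) $$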



where we use the uniform distribution as the input prior and bound (27) using the fact that Gaussian noise
has the highest entropy for a given variance. We refer to this quantity as the "soliton spectral efficiency",
which can be considered to be the NLS analog of spectral efficiency in conventional (linear) channels,
where it is measured in bits/Hertz.
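The bound is easy to evaluate directly. The helper below is a minimal sketch of the closed form $\log_2\big(\Delta\eta / \sqrt{2\pi e\, \eta_{max} \epsilon^2 Z}\big)$; the function name and the parameter values in the usage lines are illustrative and are not taken from the paper.

```python
import math

def soliton_spectral_efficiency(eta_min, eta_max, eps, z):
    """Lower bound on bits per soliton for amplitudes in [eta_min, eta_max]."""
    delta_eta = eta_max - eta_min
    # Worst-case Gaussian noise: variance eta_max * eps^2 * Z over the whole interval.
    effective_noise = math.sqrt(2.0 * math.pi * math.e * eta_max * eps**2 * z)
    return math.log2(delta_eta / effective_noise)

# Hypothetical operating point: eta in [1, 2], eps = 0.01, Z = 1.
print(soliton_spectral_efficiency(1.0, 2.0, 0.01, 1.0))
```

Since the bound depends on $\epsilon$ and $Z$ only through $\epsilon\sqrt{Z}$, halving the noise scale adds exactly one bit per soliton in this approximation.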

The capacity can also be directly evaluated using the Blahut-Arimoto algorithm (dH, [[TOl ). Using this 
algorithm for the channel model $Y = \eta + \sqrt{\eta} N$ restricted s.t. $\eta \in [1, 2]$ and $E(N^2) = 0.1^2$ we get
that the true capacity is 1.568 bits per channel use while our bound reads 1.275 bits per channel use.
The capacity achieving prior and the resultant Y distribution are plotted in Figures 2 and 3. Note that
the capacity achieving prior has both atoms and a continuous distribution which is typical of interval
constrained capacity problems ( IfTTTl ). 
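A Blahut-Arimoto evaluation of this kind can be sketched as follows. The input interval and the output axis are discretised on grids chosen here purely for illustration, so the number it produces depends on those choices and is not meant to reproduce the figures quoted above; `sqrt_noise_channel` and `blahut_arimoto` are hypothetical helper names.

```python
import numpy as np

def sqrt_noise_channel(eta_grid, y_grid, noise_std):
    """Row-stochastic matrix W[x, y] for Y = eta + sqrt(eta) * N, N ~ N(0, noise_std^2)."""
    var = noise_std**2 * eta_grid[:, None]
    W = np.exp(-(y_grid[None, :] - eta_grid[:, None])**2 / (2.0 * var))
    return W / W.sum(axis=1, keepdims=True)

def _divergences(W, p):
    """D[x] = KL(W[x, :] || output law induced by p), in nats."""
    q_y = np.maximum(p @ W, 1e-300)
    with np.errstate(divide="ignore", invalid="ignore"):
        log_ratio = np.where(W > 0, np.log(W / q_y), 0.0)
    return np.sum(W * log_ratio, axis=1)

def blahut_arimoto(W, n_iter=500):
    """Return (mutual information in bits, input prior) after n_iter BA updates."""
    p = np.full(W.shape[0], 1.0 / W.shape[0])
    for _ in range(n_iter):
        p = p * np.exp(_divergences(W, p))   # multiplicative Blahut-Arimoto update
        p /= p.sum()
    return float(p @ _divergences(W, p)) / np.log(2.0), p

eta_grid = np.linspace(1.0, 2.0, 65)         # candidate amplitudes
y_grid = np.linspace(0.0, 3.2, 641)          # quantised channel output
W = sqrt_noise_channel(eta_grid, y_grid, noise_std=0.1)
capacity_bits, prior = blahut_arimoto(W)
print("estimated capacity (bits per channel use):", capacity_bits)
```

The returned value is the mutual information of the final prior, which is a lower bound on the capacity of the discretised channel and tightens as the iteration count grows.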
Fig. 2. $\eta$'s distribution for the square root multiplicative channel

B. Information embedded in a soliton train- below the Gordon-Haus rate

The above result shows that the interval $\Delta\eta = \eta_{max} - \eta_{min}$ should be as large as possible to allow for
each soliton to convey as many bits as possible. In fact, when one considers transmitting many solitons
one after the other, there are other considerations which bound the optimal interval size from both sides,
namely intersoliton interaction and arrival time jitter.

We now consider the case where many solitons are modulated sequentially. The distance between
neighboring solitons is a multiple of the width of the widest soliton, i.e., $C/\eta_{min}$, where C is chosen so that
the intersoliton interaction has a negligible (compared to that of the noise) effect on the eigenvalues. The
distance between solitons is inversely proportional to the symbol rate and thus in an optimal system $\eta_{min}$
is bounded from below.

Since we wish to assume a perfectly (or at least an almost perfectly) synchronized communication
system, the typical arrival time jitter needs to be less than the distance between neighboring solitons. The
time of arrival jitter is known to be directly connected to fluctuations of the real part of the eigenvalue, which
is linearly related to the velocity of the soliton as can be seen from (21). The fluctuations of the real part
of the eigenvalue are very similar to those of the imaginary part:

E[{k{0) - K{Z)f] = (29) 

Using $\frac{dT_0}{dZ} = -\kappa(Z)$ we integrate to account for the arrival time jitter (neglecting terms that do not
originate from the velocity change):

EliUO) ~ UZ)f] = (30) 

This is the known Gordon-Haus ([3]) phenomenon that bounds the symbol rate of all regular soliton systems
(including OOK). The worst-case arrival time jitter is proportional to $\eta_{max}$. Thus, requiring an (almost)
jitter free model, e.g., a out-of-synchronization probability of 10^^ bounds from above 'qmax- 
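The cubic growth of the timing jitter with distance follows from integrating a random walk, and can be illustrated without committing to the exact constants. In the sketch below, `diffusion` is a hypothetical placeholder for the growth rate of the variance of $\kappa$ per unit distance; the arrival time is obtained from $dT_0/dZ = -\kappa(Z)$, and the resulting variance is compared against `diffusion * z**3 / 3`.

```python
import numpy as np

def timing_jitter_variance(diffusion, z, n_steps=200, n_paths=20000, seed=1):
    """Variance of T0(Z) - T0(0) when kappa performs a random walk along Z."""
    rng = np.random.default_rng(seed)
    dz = z / n_steps
    # kappa accumulates independent Gaussian kicks: Var[kappa(Z)] = diffusion * Z.
    kicks = rng.normal(0.0, np.sqrt(diffusion * dz), size=(n_paths, n_steps))
    kappa = np.cumsum(kicks, axis=1)
    # dT0/dZ = -kappa(Z): integrate the velocity fluctuation along the fiber.
    t0_shift = -np.sum(kappa, axis=1) * dz
    return np.var(t0_shift)

diffusion, z = 0.01, 3.0
print("simulated:", timing_jitter_variance(diffusion, z))
print("diffusion * Z^3 / 3:", diffusion * z**3 / 3.0)
```

Doubling the distance therefore multiplies the timing variance by roughly eight, which is why the jitter rather than the amplitude noise ends up limiting the symbol spacing on long links.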

We wish to compare the gain (in terms of bits/second) of the continuous amplitude modulation scheme
versus that of the OOK modulation. We assume that $\eta_{max}$ is tuned by the Gordon-Haus bound requiring
no-jitter and is shared by both the continuous system and the on-off reference system. The continuous
system has a lower symbol rate, which is $\eta_{min}/\eta_{max}$ times that of the reference system. However,
the continuous system conveys more bits than just one per soliton. Weighing both terms the continuous
system has a bit rate which is

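$$ MG = \max_{\eta_{min}} \frac{\eta_{min}}{\eta_{max}} \cdot \log\frac{\Delta\eta}{\sqrt{2\pi e\, \eta_{max} \epsilon^2 Z}} = \max_{\eta_{min}} \frac{\eta_{min}}{\eta_{max}} \cdot \log\frac{\Delta\eta}{\sigma_{eff}} \qquad (31) $$

times that of the reference system. We refer to this term as the "modulation gain". If one would also
consider the possibility that a symbol can also contain no soliton at all, and if $\eta_{min} \gg \sqrt{\eta_{max}}\,\epsilon\sqrt{Z}$ so
that the transfer probability between the continuous interval and the zero hypothesis would be less than
$10^{-3}$, then the modulation gain would approximately read:

$$ \max_{p} \max_{\eta_{min}} \frac{\eta_{min}}{\eta_{max}} \cdot \left( H_b(p) + p \cdot \log\frac{\Delta\eta}{\sqrt{2\pi e\, \eta_{max} \epsilon^2 Z}} \right) = \qquad (32) $$

$$ \max_{p} \max_{\eta_{min}} \frac{\eta_{min}}{\eta_{max}} \cdot \left( H_b(p) + p \cdot \log\frac{\Delta\eta}{\sigma_{eff}} \right), \qquad (33) $$

where $H_b(p)$ is the binary entropy of p (see union of channels in [12]). The modulation gain is plotted
in Figure 4 for different values of $\sigma_{eff}$. It is evident that as the effective SNR improves a larger $\eta_{min}$ is
better since it does not reduce the symbol rate.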
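The trade-off between symbol rate and bits per soliton can be explored with a simple grid search. The sketch assumes the modulation gain has the form $\frac{\eta_{min}}{\eta_{max}} \log_2\frac{\eta_{max} - \eta_{min}}{\sigma_{eff}}$, optionally with the $H_b(p) + p \cdot (\cdot)$ term for empty slots, where $\sigma_{eff} = \sqrt{2\pi e\, \eta_{max} \epsilon^2 Z}$; the function names and grid resolutions are illustrative.

```python
import numpy as np

def binary_entropy(p):
    """Binary entropy in bits, clipped away from 0 and 1 to avoid log(0)."""
    p = np.clip(p, 1e-12, 1.0 - 1e-12)
    return -p * np.log2(p) - (1.0 - p) * np.log2(1.0 - p)

def modulation_gain(eta_max, sigma_eff, allow_empty_slot=False, n_grid=2000):
    """Grid search over eta_min; returns (best gain, maximising eta_min)."""
    eta_min = np.linspace(0.0, eta_max, n_grid, endpoint=False)[1:]
    bits = np.log2((eta_max - eta_min) / sigma_eff)      # bits per soliton
    if allow_empty_slot:
        # Also optimise the probability p that a slot actually carries a soliton.
        p = np.linspace(0.0, 1.0, 1001)[:, None]
        bits = np.max(binary_entropy(p) + p * bits[None, :], axis=0)
    gain = (eta_min / eta_max) * bits                    # rate penalty times bits
    best = int(np.argmax(gain))
    return float(gain[best]), float(eta_min[best])

print(modulation_gain(1.0, 0.01))
print(modulation_gain(1.0, 0.01, allow_empty_slot=True))
```

For a fixed per-soliton rate $c$, the inner maximum over $p$ equals $\log_2(1 + 2^c)$, the capacity of a union of a $c$-bit channel with a zero-rate one, so allowing empty slots can only help.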




Fig. 4. Modulation gains as a function of $\eta_{min}$ for different $\sigma_{eff}$.



C. Information embedded in a 2-bound soliton train- below the Gordon-Haus rate 

The system described above could be analyzed using the framework of perturbations to sech profiles
without necessarily using the perturbation theory of the inverse scattering transform. However, considering
more complicated symbols made up of more than one soliton, the IST has major analytical and practical
advantages. This is the case when the symbols are confined to be either a 2-soliton bound state or a single
soliton (or none). We now analyze the modulation gain of this more complicated system and address such
issues as common jitter and whether the solitons should be concentric or partially spaced apart.

The idea of transmitting a few concentric solitons is proposed in the paper by Hasegawa et al. However,
a 2-bound soliton is affected by noise differently than each one of its components. We show that a 2-bound
soliton solution has a larger jitter than its components. Therefore there is a tradeoff between the enlarged
bit rate and a smaller symbol rate that is induced by a larger jitter.

The basic symbol is now comprised of a 2-bound soliton. This means the transmitter solves the following 
reflectionless algebraic inverse scattering problem for N = 2 (ill): 

fin = \/C~n^{T-Cn) 1 = 1,2 
Fl = (/«!,,,, /w) 
Mnm = ^n(^rnl (Cn Cm) 

en = V^exp«„T) £" = (61,,,, cnY 

The norming constants are used to localize the different eigenfunctions. As a generalization of the
single soliton case, we choose |6n(0)| = e^^"*"^°^ where t„(0) is the generalized position of the nth 
eigenfunction. Actually, the eigenfunctions interact with one another and the resulting time waveform is 
not a superposition of 2 single soliton profiles. Nevertheless, their generalized position remains unchanged 
throughout the evolution (apart from noise influence) and can be recovered at the receiver. The generalized 
position evolution is given by (to the first order): 

$$ \frac{dt_n}{dZ} = -\kappa_n, $$

and thus it behaves in the same way as the center of a single soliton. However, the fluctuations of the
eigenvalues of a 2-bound soliton, in both imaginary and real parts, are not orthogonal anymore. In fact they
are highly correlated in the case of a small separation between generalized locations or in the case of very
similar eigenvalues. Moreover, the variance of the fluctuations is generally magnified when the solitons
"overlap". This effect makes modulating non-concentric solitons (or actually eigenfunctions) a sensible
thing to do. We plot the variance of the eigenvalues as a function of the separation between the generalized
positions in Figure 5. In this setting the detector sees two eigenvalues and two norming constants that
translate to generalized positions. All of these scalar quantities are now perturbed by noise. Since the two
eigenfunctions are assumed to be too close to each other to allow for neglecting the Gordon-Haus
jitter, we must account for the way the jitter affects the capacity.

In linear communication problems a non-negligible jitter in symbol arrival times can diminish the 
achievable rate to zero. This is due to the fact that in a linear channel the signal space is made up of 
translations of a limited number of base functions. Once there is a jitter, these functions are no longer 
orthogonal and one cannot differentiate between neighboring symbols.

However, in a nonlinear integrable system, solitons can be detected through the direct scattering 
transform even if they are one on top of the other. Actually, they can be detected but not differentiated, 
i.e., both will be apparent but the receiver will not know which of the two belongs to the original slot. 

Fig. 5. Variance gain (through Monte-Carlo simulations) due to proximity of eigenfunctions. The eigenvalues are 2 and 1. The first
eigenfunction has a fixed generalized position t while the second's position is changed.

To lower bound the achievable rate of the jitter-affected system we assume that once the eigenfunctions
are detected they are sorted according to time of arrival. This channel is equivalent to transmitting a
couple of solitons (eigenvalues), adding noise and finally permuting them in case they switched places.
We denote the perturbed eigenvalues before and after the possible permutation by $Y^n$ and $W^n$, respectively
(n = 2 for the 2-bound soliton case). The permutation, which is a random variable, is denoted by $\pi^n$. The
information theoretic loss (in bits) due to the jitter is bounded by:

= h{Y^) - h{w'^) - + /i(w^riO 

< Mn",<K)-Mn"K) 
= ifK|rr,<) 

< ^W) 

For the two soliton case, the permutation R.V. is equivalent to a Bernoulli R.V. where the mix-up 
probability is equal to the probability that the order of the generalized positions is changed. Using the 
assumption that the eigenvalues will approximately fluctuate in the same way as if the solitons were apart 
(and this is not true when they walk-by each other) we can approximate this probability. For the set of 
generalized positions -1,1, this probability is equal to Pmix-up < Pr(AT > 1) where AT ~ A^(0, liZzr^^^). 
If this probability turns out to be $P_{mix-up} = 0.1$, which is conventionally thought to be prohibitively large,
the rate loss is only $H_b(0.1) \approx 0.5$ bits for the 2-soliton symbol and only 0.25 bits per soliton ($H_b(p)$ is
the Shannon binary entropy function). The main advantage is a major increase in the soliton rate, since
there are two solitons per symbol.
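The size of this penalty is easy to check numerically. The sketch below evaluates the binary entropy at a mix-up probability of 0.1 and uses the Gaussian tail function to relate a mix-up probability to a jitter standard deviation; the helper names and the specific jitter value are illustrative.

```python
import math

def binary_entropy_bits(p):
    """Shannon binary entropy H_b(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def gaussian_tail(x):
    """Q(x) = Pr(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

loss_per_symbol = binary_entropy_bits(0.1)   # bits lost per 2-soliton symbol
loss_per_soliton = loss_per_symbol / 2.0
print(loss_per_symbol, loss_per_soliton)

# A zero-mean Gaussian offset with standard deviation 0.78 exceeds 1 about 10% of the time.
print(gaussian_tail(1.0 / 0.78))
```

The exact value $H_b(0.1) \approx 0.469$ is slightly below the rounded figure of 0.5 bits, so the rounded figure is a conservative estimate of the loss.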

Assuming the spacing between solitons of the same symbol is about $a/\eta_{min}$ and that the original distance
between symbols was $C/\eta_{min}$, the soliton rate is increased by a factor of $2 \cdot \frac{C}{C + a}$. We approximate the
mix-up probability to be Pmix-up ~ Q ( ) • Thus for this setting the "modulation gain" compared to

V ^jitter J 

a simple OOK system is approximately: 

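$$ \max_{p}\ 2 \cdot \frac{C}{C + a}\ \max_{\eta_{min}} \frac{\eta_{min}}{\eta_{max}} \cdot \left( H_b(p) + p \log\frac{\Delta\eta}{\sqrt{2\pi e\, \eta_{max} \epsilon^2 Z}} - H_b(p_{mix-up})/2 \right). \qquad (34) $$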

The modulation gain for a certain set of parameters is shown in Figure 6. The gain compared to single
soliton trains is roughly 2 for a wide set of parameters.




Fig. 6. Modulation gains as a function of $\eta_{min}$ for two soliton trains vs. single soliton trains.



D. Approximating the Information embedded in a soliton train- slightly above Gordon-Haus rate 

The next natural generalization is to consider an N-bound solution that is made up of a train of well-spaced
(spacing relates to the value of the norming constants) eigenfunctions (we assume N to be large,
i.e. ^5). The analysis of the former subsection is still a good approximation. The difference is that now the 
ambiguity in time of arrival is not bounded to a pair of solitons. Still, if the eigenfunctions are properly
spaced the entropy of the order-of-arrival sequence, $H(\pi^n)$, is mainly determined by the probability that
consecutive eigenfunctions will change their order of arrival. The information theoretic penalty on the bit
rate due to this effect is:

$$ \frac{1}{2} H(p_{mix-up},\ 1 - 2p_{mix-up},\ p_{mix-up}) \approx p_{mix-up} \cdot \log\frac{1}{p_{mix-up}}. \qquad (35) $$



Now, assume the spacing between solitons is approximately (much smaller than the one called for 



mm 



by the Gordon-Haus limit) and the total modulation gain in this setting is: 

max max ■ i Hb{p) + p log _ p^.^_^^ . log 1/pmix-up ) • (36) 

P Vrmn Tj^ax V TX 6 Tfrnax Z 
Again, there is no problem with trains of eigenfunctions with a typical mix-up (between consecutive
eigenfunctions) probability of 0.1. Moreover now there is a clear tradeoff for eigenfunction spacing. The 
bigger the spacing the smaller the symbol rate. As the spacing becomes smaller the penalty due to jitter 
is larger and so a unique maximum exists. The main disadvantage compared to the previous subsection 
is that the processing now involves a more complicated channel code. The main advantage is a larger 
symbol rate. 

The analysis above neglects a few things: 

1) There is small coupling between amplitude and time-of-arrival fluctuations. A precise analysis should 
only yield a higher rate. 

2) When two solitons pass by each other, their perturbation statistics are changed. In many cases, their
amplitude fluctuations grow and are now dependent. We ignore the growth in fluctuations since,
assuming that solitons are not too crowded, the walk-off is time bounded and its effects are negligible.
Furthermore, the dependency can only increase the rate.

3) We ignore the possibility that a soliton will die/be born. This happens with a small probability and
we assume that its effect on the achievable rates can also be bounded.

V. Discussion and further work 

The notion of modulating the "natural" domain of the channel is not new to communication theory. In 
fact, the scheme discussed in this paper can be considered to be the nonlinear analog of OFDM. Both of 
the methods allow for a natural examination of their respective channel capacities. There are two main 
differences between the two methods. The first is that in linear channels the noise projection on different 
modes (spectral bands) is orthogonal while in the nonlinear case the noise projection on different modes 
(solitons) is orthogonal only in some cases (see Figure ). The second is that OFDM is very efficient in 
terms of complexity (through the use of the celebrated FFT and IFFT) while the direct scattering is a 
computationally intensive method. 

Future research directions include: 

1) Find reasonable complexity (preferably analog) methods to carry out the tasks of inverse and 
especially direct scattering in the transmitter and receiver. 

2) Use the approach discussed in the paper with more complex potentials/waveforms (not reflectionless)
to lower and upper bound the overall capacity (and not just achievable rates).

3) While the problems above are not related to information theory, there is a totally new and interesting
information-theoretic problem that relates to communication via the scattered domain. When receiving
waveforms that are comprised of N-bound solitons or solitons that are co-centric due to jitter
(and not through the constructed modulation) one detects a set of scalar values that can be detected but
not differentiated. Essentially, the transmitter and receiver communicate through the transmission
of a set, not a sequence, of perturbed scalar values. Clearly, transmitting and receiving a 3-bound
soliton conveys less information than a sequence of (ordered in time) three solitons. The question
is how much less? We call this problem: communicating with colorless, but not massless, balls.
For more on this issue see the work by Meron et al. [13].
Abstract

We analyze the achievable communication rates of a generalized soliton-based transmission system for the
optical fiber channel. This method is based on modulation of parameters of the scattering domain, via the inverse
scattering transform, by the information bits. The decoder uses the direct spectral transform to estimate these
parameters and decode the information message. Unlike ordinary On-Off keying soliton systems, the solitons'
amplitude is allowed to be part of a continuous interval. A considerable rate gain is shown in the case where
the waveforms are 2-bound soliton states. Using traditional information theory and inverse scattering perturbation
theory, we analyze the influence of the amplitude fluctuations as well as soliton arrival time jitter, on the achievable
rates. Using this approach we show that the time of arrival jitter (Gordon-Haus) limits the information rate in a
continuous manner, as opposed to a strict threshold in On-Off keying systems.



I. Introduction 

Communication through optical fiber channels has evolved enormously in the past couple of decades
leading to information rates unprecedented in communication problems. Current information theoretic
techniques have failed to produce relevant methods to predict the capacity bounds for these channels. The
paper by Mitra and Stark ([1]) has drawn attention to the problem of finding the capacity of the optical
fiber channel. The authors claim that the capacity of wavelength division multiplexing (WDM) systems is
power-bounded, i.e., it is not a monotonically increasing function of the input power. This is a surprising result that
proves that intuition gained from knowledge of linear problems can be misleading.

The basic generic partial differential equation (PDE) that describes the value of the electric field in 
space and time (in one dimension) in the optical fiber channel is (using normalized coordinates and the 
notations of []): 

$$ i\frac{\partial q}{\partial z} + \frac{1}{2}\frac{\partial^2 q}{\partial t^2} + |q|^2 q = 0 \qquad (1) $$

where the input of the channel is $q(0,t)$ and the output is $q(L,t)$. This equation is also known as the
nonlinear scalar Schrödinger (NLS) equation.

Since the equivalent channel is nonlinear, a frequency based analysis is out of the question. The usual 
way to analyze a continuous-time channel in traditional information theoretic methods is to reduce the 
problem into a discrete one by considering the Nyquist samples of the input and output. However, 
since a bandlimited input signal evolves into an output signal of an infinite bandwidth, it is hard to 
find such a discrete-time model. We stress that the nonlinearity invoked by the channel is fundamental 
and is conceptually different from nonlinearities caused by transmitter/receiver elements, e.g., amplifier
nonlinearities, that have been studied in the past. 

A different approach to analyzing signal evolution in nonlinear channels is the inverse scattering
transform (IST). The IST is important because it allows the use of linear techniques to solve initial
value problems for nonlinear problems. The main advantages of the IST are that the number of degrees
of freedom that a signal is comprised of, i.e. number of solitons and radiation bandwidth, does not
change through signal evolution and that there are natural invariant-over-time scalar entities, i.e. eigenvalues.
The evolution of the solution in time is most naturally described through the IST and thus the IST may
lead us to insights regarding communication strategies. For an in-depth survey of the IST, also known as
the nonlinear Fourier transform, and an OFDM-like communication transmission method, see the paper
by Yousefi et al. [2], [3].

Hasegawa and Nyu (see [4], [5]) proposed a communication method that utilizes the fact that the
eigenvalues associated with the IST do not change in time. The advantages of the method proposed
by Hasegawa et al. are that it is inherently multi-valued and is similar to frequency based methods for
linear channels. The authors do not analyze the effects of amplifier noise on the eigenvalues and its
implications on channel capacity. In this paper we elaborate on the ideas of eigenvalue communications
and use results from perturbation theory (see for example [6], [7]) for nonlinear models to estimate the
capacity of nonlinear channels. We extend the idea of eigenvalue communication to that of spectral data
modulation and use the inverse scattering transform as our transmitter and the direct spectral transform
in the receiver. We quantify the effects of amplitude fluctuations and jitter on achievable communication
rates and evaluate them for realistic configurations.

II. Setting 

We assume that the channel model is represented by: 

$$ i\frac{\partial q}{\partial z} + \frac{1}{2}\frac{\partial^2 q}{\partial t^2} + |q|^2 q = \epsilon R \qquad (2) $$

where $\epsilon R$ is the perturbation term. Throughout this paper we assume that this perturbation is white noise
(in space and time) with a unit power spectral density (PSD) and $\epsilon$ is used as a scaling parameter for
the noise power that can be related to the physical parameters of the channel. We will later plug in these
parameters to obtain practical results. The noise is generated by the effects of amplifiers that are spread
throughout the fiber but we assume it is injected adiabatically^1.
The information rate, $R_b$, that can be achieved on this channel is upper bounded by [8]

$$ R_b \le \max I(q(0,t); q(L,t)). \qquad (3) $$

where the maximization is taken over some input constraint (e.g. an average power constraint, a peak
power constraint, Fourier bandwidth or maximal number of solitons). Evaluating the quantity above turns
out to be a very difficult task for nonlinear channels. In this paper we argue that the most tractable way of
evaluating this quantity is through the statistics of the scattering data of the IST, namely the eigenvalues
and the absolute value of the norming constants.

Since the IST is a one-to-one transformation the mutual information between the waveforms is equivalent
to the mutual information between the scattering data, i.e.,

$$ I(q(0,t); q(L,t)) = I(\Sigma(z=0); \Sigma(z=L)). \qquad (4) $$

To lower bound this quantity one can assume that the input is a reflectionless potential so that the
information is transmitted solely through the discrete eigenvalues and corresponding norming constants, i.e.,

$$ I(\Sigma(z=0); \Sigma(z=L)) \ge I(\{\zeta_n(0), C_n(0)\}; \{\zeta_n(L), C_n(L)\}) \quad \text{for } n = 1, 2, \ldots, \infty, $$

where the time index is added since the Gaussian noise changes the eigenvalues (that are otherwise constant)
and can also possibly change their number via the birth/death of a soliton.

^1 i.e., infinitesimal noise admitted at every point along the channel

The observation that the mutual information in a nonlinear integrable channel can and should be
evaluated through the statistics of the scattering data is the main observation in this paper. This approach is
motivated by several reasons. First, unlike the linear spectral domain (i.e., Fourier methods where spectral
broadening is a result of the nonlinearity) the number of degrees of freedom in the scattering domain
remains unchanged throughout the noiseless evolution. Second, the eigenvalues and norming constants
serve as scalar candidates for the transmission of information, implying a new notion of a nonlinear signal
space. The evaluation of this mutual information is still a cumbersome task, yet it can be approximated
assuming some further restrictions on the input signals.

III. Main Results 

In the generalized soliton transmission system we analyze, a codeword is a (large) set of symbols. 
Each symbol is in fact a set of eigenvalues and norming constants. At the transmitter, the waveform to
be transmitted is generated using the inverse scattering transform. At the receiver, direct scattering is 
applied to derive the set of (perturbed) eigenvalues and norming constants. We assume that the waveforms 
that correspond to different "symbols" are separated enough so that both the inherent truncation of the 
waveforms and interaction between neighboring symbols have negligible effects on the eigenvalues and 
will thus be ignored. 

Throughout this section the imaginary parts of the eigenvalues, which can be considered to be
generalized amplitudes, will be the information carrying agents, while the norming constants will be used to
convey the localization of the eigenfunctions.

Fig. 1. The IST-based communication scheme



A. Information embedded in a single soliton 

In this setting single (slightly truncated) solitons are modulated. Unlike ordinary On-Off keying, their
amplitudes belong to a continuous interval. Without a perturbation, the single soliton solution for the NLS is

$$ q(T,Z) = \eta\,\mathrm{sech}[\eta(T + \kappa Z - T_0)] \exp\left(-i\kappa T + \frac{i}{2}(\eta^2 - \kappa^2) Z + i\sigma_0\right) \qquad (5) $$

for which the corresponding discrete eigenvalue of the IST is $\zeta = (\kappa + i\eta)/2$. For the rest of the paper
we assume all eigenvalues are purely imaginary (except for perturbations). The localization of the soliton

is around |To| = e^^^^'^. 

Using results ([5]) for the first order perturbations of the eigenvalues: 



The resulting fluctuation in the amplitude is

$$ \frac{d\eta}{dZ} = \epsilon \int_{-\infty}^{\infty} \mathrm{Im}\{R e^{-i\phi}\}\,\mathrm{sech}\,\tau\, d\tau \qquad (6) $$

where $\tau = \eta(T - T_0)$.

Assuming $R$ is a bandlimited white Gaussian noise, i.e. $\langle R(t,z), R(t',w) \rangle = \delta(t - t')\delta(z - w)$, we
get that

$$ E[(\eta(0) - \eta(Z))^2] \approx \epsilon^2 \eta Z \qquad (7) $$

i.e., the variance of the additive noise is proportional to $\eta$ (unlike ordinary multiplicative noise for which
the variance is proportional to $\eta^2$).
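
The scaling in (7) can be checked numerically. The sketch below is only an illustration: it assumes that the amplitude obeys the scalar stochastic equation $d\eta = \epsilon\sqrt{\eta}\,dW$ (a stand-in consistent with the stated variance, not a simulation of the full NLS), and the function name `simulate_amplitude` and all parameter values are hypothetical.

```python
import numpy as np

def simulate_amplitude(eta0, eps, Z, n_paths=20000, n_steps=200, seed=0):
    """Euler-Maruyama paths of d(eta) = eps * sqrt(eta) dW over a distance Z.

    Assumption: this scalar model replaces the projection integral (6);
    it is only meant to reproduce the variance scaling of (7).
    """
    rng = np.random.default_rng(seed)
    dz = Z / n_steps
    eta = np.full(n_paths, eta0, dtype=float)
    for _ in range(n_steps):
        dW = rng.normal(0.0, np.sqrt(dz), size=n_paths)
        # clip at zero so the square root stays defined if a path collapses
        eta = eta + eps * np.sqrt(np.clip(eta, 0.0, None)) * dW
    return eta

eta0, eps, Z = 1.5, 0.05, 1.0          # hypothetical values
eta_Z = simulate_amplitude(eta0, eps, Z)
empirical_var = np.mean((eta_Z - eta0) ** 2)
predicted_var = eps ** 2 * eta0 * Z    # right-hand side of (7)
print(empirical_var, predicted_var)
```

Under these assumptions the empirical variance should track $\epsilon^2 \eta Z$, growing linearly rather than quadratically with the amplitude.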

Thus, assuming information is transmitted in the amplitude of a single soliton ($T_0 = 0$) we have the
following scalar channel:

$$ Y = 2\,\mathrm{Im}(\zeta(Z)) = \eta(Z) = \eta + \sqrt{\eta}\,N \qquad (8) $$

where $E(N^2) = \epsilon^2 Z$. We dismiss the probability that the soliton vanishes completely and allow for Y
to be theoretically zero (or negative). This scenario can be prevented (with high probability) by using
$\sqrt{\eta} \gg \epsilon\sqrt{Z}$, which in the limit of $\epsilon$ going to zero has negligible effect on the capacity. We lower bound
the mutual information for the case $\eta \in [\eta_{\min}, \eta_{\max}]$ with $\Delta\eta = \eta_{\max} - \eta_{\min}$. It is assumed that the noise
is Gaussian and of the largest possible variance:

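$$ \begin{aligned}
I &\ge h(Y) - h(Y|\eta) && (9) \\
  &\ge h(\eta) - h(Y|\eta) && (10) \\
  &\ge h(\eta) - \tfrac{1}{2}\ln\left(2\pi e\,\eta_{\max}\epsilon^2 Z\right) && (11) \\
  &= \log\frac{\Delta\eta}{\sqrt{2\pi e\,\eta_{\max}\epsilon^2 Z}} \ \text{bits/soliton} && (12)
\end{aligned} $$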

where we use the uniform distribution as the input prior. We refer to this quantity as the "soliton spectral 
efficiency" which can be considered to be the NLS analog of spectral efficiency in conventional (linear) 
channels where it's measured in bits/Hertz. 
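
As a small worked example, the bound $\log_2\left(\Delta\eta / \sqrt{2\pi e\,\eta_{\max}\epsilon^2 Z}\right)$ can be evaluated directly. The helper name `soliton_bits_lower_bound` is hypothetical, and `noise_var` stands for the product $\epsilon^2 Z$; the numbers are those of the $\eta \in [1, 2]$, $E(N^2) = 0.08^2$ example used in the text.

```python
import math

def soliton_bits_lower_bound(eta_min, eta_max, noise_var):
    """Lower bound on bits per soliton; noise_var stands for eps^2 * Z."""
    delta_eta = eta_max - eta_min
    worst_case_std_term = math.sqrt(2 * math.pi * math.e * eta_max * noise_var)
    return math.log2(delta_eta / worst_case_std_term)

bound = soliton_bits_lower_bound(1.0, 2.0, 0.08 ** 2)
print(round(bound, 3))  # about 1.097
```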

The capacity can also be directly evaluated using the Blahut-Arimoto algorithm ([9], [10]). Using this
algorithm for the channel model $Y = \eta + \sqrt{\eta}N$ restricted s.t. $\eta \in [1, 2]$ and $E(N^2) = 0.08^2$ we get that
the true capacity is 1.79 bits per channel use while our bound reads 1.097 bits per channel use. The
capacity achieving prior and the resultant Y distribution are plotted in Figure 2. Note that the capacity
achieving prior has both atoms and a continuous distribution.
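
A minimal sketch of such a Blahut-Arimoto evaluation is given below. It is not the authors' implementation: the grids for $\eta$ and $Y$, the iteration count and the function name `blahut_arimoto` are assumptions, and the discretization means the printed value only approximates the capacity of the continuous channel.

```python
import numpy as np

def blahut_arimoto(P, n_iter=500):
    """Return (mutual information in bits, input prior) for a row-stochastic P[x, y]."""
    n_x = P.shape[0]
    p = np.full(n_x, 1.0 / n_x)             # start from the uniform prior
    for _ in range(n_iter):
        q_y = p @ P                          # output distribution induced by p
        D = np.sum(P * np.log(P / q_y), axis=1)  # KL(P[x, :] || q_y) in nats
        p = p * np.exp(D - D.max())          # multiplicative update, stabilised
        p /= p.sum()
    q_y = p @ P
    D = np.sum(P * np.log(P / q_y), axis=1)
    return float(p @ D) / np.log(2.0), p

# Discretised version of Y = eta + sqrt(eta) * N with eta in [1, 2], E(N^2) = 0.08^2.
eta = np.linspace(1.0, 2.0, 101)             # hypothetical input grid
y = np.linspace(0.4, 2.9, 501)               # hypothetical output grid
sigma = 0.08
var = eta[:, None] * sigma ** 2              # noise variance grows linearly with eta
P = np.exp(-(y[None, :] - eta[:, None]) ** 2 / (2.0 * var))
P /= P.sum(axis=1, keepdims=True)            # each row is a conditional pmf

capacity_bits, prior = blahut_arimoto(P)
print(capacity_bits)
```

Plotting `prior` against `eta` gives a rough picture of how the optimizing input distribution concentrates its mass.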



B. Information embedded in a soliton train - below the Gordon-Haus rate

The above result shows that the interval $\Delta\eta = \eta_{\max} - \eta_{\min}$ should be as large as possible to allow for
each soliton to convey as many bits as possible. In fact when one considers transmitting many solitons
one after the other, there are other considerations which bound the optimal interval size from both sides,
namely intersoliton interaction and arrival time jitter.

Fig. 2. Capacity achieving prior and Y's distribution for the square root multiplicative channel

We now consider the case where many solitons are modulated sequentially. The distance between
neighboring solitons is a multiple of the width of the widest soliton, i.e., $C/\eta_{\min}$, where $C$ is chosen so that
the intersoliton interaction has a negligible (compared to that of the noise) effect on the eigenvalues (see
[]). The distance between solitons is inversely proportional to the symbol rate and thus in an optimal
system $\eta_{\min}$ is bounded from below.

Since we wish to assume a perfectly (or at least an almost perfectly) synchronized communication
system, we need the typical arrival time jitter to be less than the distance between neighboring
solitons. The time of arrival jitter is known to be directly connected to fluctuations of the real part of the
eigenvalue, which is linearly related to the velocity of the soliton as can be seen from (5). The fluctuations of
the real part of the eigenvalue are very similar to those of the imaginary part:

EMO) - KiZ))'] = (13) 

Using $dT_0/dZ = -\kappa(Z)$ we integrate to account for the arrival time jitter (neglecting terms that do not
originate from the velocity change):

E[{UO) - UZ)f] = (14) 
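
For orientation, the following sketch records the scaling that standard first-order soliton perturbation theory gives for the velocity and timing fluctuations, under the same noise normalization that leads to an amplitude variance of $\epsilon^2 \eta Z$. The constants $1/3$ and $1/9$ are assumptions of this sketch (they follow from $\int \mathrm{sech}^2\tau \tanh^2\tau \, d\tau = 2/3$ compared with $\int \mathrm{sech}^2\tau \, d\tau = 2$) and may differ from the normalization used by the authors.

```latex
\begin{align*}
E[(\kappa(0) - \kappa(Z))^2] &\approx \tfrac{1}{3}\,\epsilon^2 \eta Z, \\
T_0(Z) - T_0(0) &= -\int_0^Z \kappa(z)\,dz, \\
E[(T_0(0) - T_0(Z))^2] &\approx \tfrac{1}{3}\,\epsilon^2 \eta \int_0^Z\!\!\int_0^Z \min(z, z')\,dz\,dz'
 = \tfrac{1}{9}\,\epsilon^2 \eta Z^3 .
\end{align*}
```

In this sketch $\kappa(z)$ inside the integral denotes the deviation from its initial value. The timing variance grows cubically with distance and linearly with the amplitude, so the largest amplitude in the alphabet sets the worst-case jitter.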

This is the known Gordon-Haus ([11]) phenomenon that bounds the symbol rate of all regular soliton
systems (including on-off keying). The worst-case arrival time jitter is proportional to $\eta_{\max}$. Thus, requiring
an (almost) jitter-free model, e.g., an out-of-synchronization probability of 10^^, bounds $\eta_{\max}$ from above.
We wish to compare the gain (in terms of bits/second) of the continuous amplitude modulation scheme
versus that of the on-off keying modulation. We assume that $\eta_{\max}$ is tuned by the Gordon-Haus bound
requiring no jitter and is shared by both the continuous system and the on-off reference system. The
continuous system has a lower symbol rate which is $\eta_{\max}/\eta_{\min}$ times smaller than that of the reference system^2.
However, the continuous system conveys more bits than just one per soliton. Weighing both terms the
continuous system has a bit rate which is

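$$ \max_{\eta_{\min}} \frac{\eta_{\min}}{\eta_{\max}} \cdot \log\frac{\Delta\eta}{\sqrt{2\pi e\,\eta_{\max}\epsilon^2 Z}} = \max_{\eta_{\min}} \frac{\eta_{\min}}{\eta_{\max}} \cdot \log\frac{\Delta\eta}{\sigma_{\mathrm{eff}}} \qquad (15) $$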

times that of the reference system. We refer to this term as the "modulation gain". If one would also
consider the possibility that a symbol can also contain no soliton at all, and if $\eta_{\min} \gg \sqrt{\eta_{\max}}\,\epsilon\sqrt{Z}$ so
that the transfer probability between the continuous interval and the zero hypothesis would be less than
$10^{-3}$, then the modulation gain would approximately read

$$ \max_{\eta_{\min}} \frac{\eta_{\min}}{\eta_{\max}} \cdot \left(1 + \log\frac{\Delta\eta}{\sqrt{2\pi e\,\eta_{\max}\epsilon^2 Z}}\right) = \max_{\eta_{\min}} \frac{\eta_{\min}}{\eta_{\max}} \cdot \left(1 + \log\frac{\Delta\eta}{\sigma_{\mathrm{eff}}}\right). \qquad (16) $$

The modulation gain is plotted in Figure 3 for different values of $\sigma_{\mathrm{eff}}$. It is evident that as the effective
SNR improves a larger $\eta_{\min}$ is better since it does not reduce the symbol rate.

Fig. 3. Modulation gains as a function of $\eta_{\min}$ for different $\sigma_{\mathrm{eff}}$.
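
The tradeoff behind the modulation gain can be sketched numerically. The code assumes the gain has the form $(\eta_{\min}/\eta_{\max}) \cdot \log_2(\Delta\eta/\sigma_{\mathrm{eff}})$, with an optional extra bit for the empty-slot hypothesis; treating $\sigma_{\mathrm{eff}}$ as a single effective noise parameter, the function name `modulation_gain` and the numerical values are all assumptions made for illustration.

```python
import math

def modulation_gain(eta_min, eta_max, sigma_eff, allow_empty_slot=False):
    """Bit-rate ratio of the continuous-amplitude train over the on-off reference.

    Assumed form: (eta_min / eta_max) * log2((eta_max - eta_min) / sigma_eff),
    plus one extra bit per symbol when an empty slot is an allowed symbol.
    """
    rate_ratio = eta_min / eta_max                 # symbol-rate penalty of wide solitons
    bits_per_soliton = math.log2((eta_max - eta_min) / sigma_eff)
    if allow_empty_slot:
        bits_per_soliton += 1.0
    return rate_ratio * bits_per_soliton

eta_max, sigma_eff = 1.0, 0.01                     # hypothetical operating point
candidates = [k / 100 for k in range(1, 99)]
best_eta_min = max(candidates, key=lambda x: modulation_gain(x, eta_max, sigma_eff))
best_gain = modulation_gain(best_eta_min, eta_max, sigma_eff)
print(best_eta_min, best_gain)
```

A small $\eta_{\min}$ widens the amplitude interval but lowers the symbol rate, so the gain has an interior maximum; lowering `sigma_eff` moves that maximum toward larger $\eta_{\min}$.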



C. Information embedded in a 2-bound soliton train - below the Gordon-Haus rate

The system described above could be analyzed using the framework of perturbations to sech profiles
without necessarily using the perturbation theory of the inverse scattering transform. However, when considering
more complicated symbols made up of more than one soliton, the IST has major analytical and practical
advantages. This is the case when the symbols are confined to be either a 2-soliton bound state or a single
soliton (or none). We now analyze the modulation gain of this more complicated system and address such
issues as common jitter and whether the solitons should be concentric or partially spaced apart.

^2 Actually, one can also analyze the case where symbol widths are not constant and are proportional to

The idea of transmitting a few concentric solitons is proposed in the paper by Hasegawa et al. However, 
a 2-bound soliton is affected by noise differently than each one of its components. We show that a 2-bound
soliton solution has a larger jitter than its components. Therefore there is a tradeoff between the enlarged 
bit rate and a smaller symbol rate that is induced by a larger jitter. 

The basic symbol is now comprised of a 2-bound soliton. This means the transmitter solves the following 
reflectionless algebraic inverse scattering problem for $N = 2$ ([5]):

fin = \/a^(T;Cn) ^ = 1,2 

Fl = (/zi, , , , /iAf) 
Mnm = e„ey(Cn-C) 

en = ^/C^exp{iCnT) E ^ {ei, , , , cn)* 

The norming constants are used to localize the different eigenfunctions. As a generalization of the 
single soliton case, we choose |fen(0)| = e^''"*"*^*^^ where in(0) is the generalized position of the rith 
eigenfunction. Actually, the eigenfunctions interact with one another and the resulting time waveform is 
not a superposition of 2 single soliton profiles. Nevertheless, their generalized position remains unchanged 
throughout the evolution (apart from noise influence) and can be recovered at the receiver. The generalized 
position evolution is given by (to the first order): 

'"<^' - (^) 
^ = 

and thus it behaves in the same way as the center of a single soliton. However, the fluctuations of the
eigenvalues of a 2-bound soliton, both imaginary and real parts, are not orthogonal anymore. In fact they
are highly correlated in the case of a small separation between generalized locations or in the case of very
similar eigenvalues. Moreover, the variance of the fluctuations is generally magnified when the solitons
"overlap". This effect makes modulating non-concentric solitons (or actually eigenfunctions) a sensible
thing to do. We plot the variance of the eigenvalues as a function of the separation between the generalized
positions in Figure 5.

In this setting the detector sees two eigenvalues and two norming constants that translate to generalized
positions. All of these scalar quantities are now perturbed by noise. Since the two eigenfunctions are
assumed to be too close to each other to allow for neglecting the Gordon-Haus jitter, we must
account for the way the jitter affects the capacity.

In linear communication problems a non-negligible jitter in symbol arrival times can diminish the
achievable rate to zero. This is due to the fact that in a linear channel the signal space is made up of
translations of a limited number of base functions. Once there is jitter, these functions are no longer
orthogonal and one cannot differentiate between neighboring symbols.

However, in a nonlinear integrable system, solitons can be detected through the direct scattering 
transform even if they are one on top of the other. Actually, they can be detected but not differentiated, 
i.e., both will be apparent but the receiver will not know which of the two belongs to the original slot. 

To lower bound the achievable rate of the jitter-affected system we assume that once the eigenfunctions
are detected they are sorted according to time of arrival. This channel is equivalent to transmitting a
couple of solitons (eigenvalues), adding noise and finally permuting them in case they switched places.
We denote the perturbed eigenvalues before and after the possible permutation by $Y_1^n$ and $W_1^n$, correspondingly
($n=2$ for the 2-bound soliton case). The permutation, which is a random variable, is denoted by $\pi^n$. The
information theoretic loss (in bits) due to the jitter is bounded by:



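$$ \begin{aligned}
I(\eta_1^n; Y_1^n) - I(\eta_1^n; W_1^n)
 &= h(Y_1^n) - h(W_1^n) - h(Y_1^n|\eta_1^n) + h(W_1^n|\eta_1^n) \\
 &\le h(W_1^n|\eta_1^n) - h(Y_1^n|\eta_1^n) \\
 &\le h(\pi^n, Y_1^n|\eta_1^n) - h(Y_1^n|\eta_1^n) \\
 &= H(\pi^n|Y_1^n, \eta_1^n)
\end{aligned} $$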

For the two soliton case, the permutation R.V. is equivalent to a Bernoulli R.V. where the mix-up 
probability is equal to the probability that the order of the generalized positions is changed. Using the 
assumption that the eigenvalues will approximately fluctuate in the same way as if the solitons were apart 
(and this is not true when they walk-by each other) we can approximate this probability. For the set of 
generalized positions -1,1, this probability is equal to Pmix-up < Pr(AT > 1) where AT ~ A^(0, ^^Itn^^y 
If this probability turns out to be $p_{\text{mix-up}} = 0.1$, which is conventionally thought to be prohibitively large,
the rate loss is only $H_b(0.1) \approx 0.5$ bits for the 2-soliton symbol and only 0.25 bits per soliton ($H_b(p)$ is
the Shannon binary entropy function). The main advantage is a major increase in the soliton rate, since
there are two solitons per symbol.
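
The size of this penalty is easy to reproduce. The sketch below evaluates the binary entropy at a mix-up probability of 0.1 and, as a separate illustration, the Gaussian tail probability $\Pr(\Delta T > 1)$ for a zero-mean jitter; the jitter standard deviation used here is a hypothetical value chosen to give a mix-up probability near 0.1, not one taken from the analysis.

```python
import math

def binary_entropy(p):
    """Shannon binary entropy H_b(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)

def q_function(x):
    """Gaussian tail probability Pr(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

p_mix = 0.1
loss_per_symbol = binary_entropy(p_mix)      # bits lost per 2-soliton symbol
loss_per_soliton = loss_per_symbol / 2.0

jitter_std = 1.0 / 1.2816                    # hypothetical jitter standard deviation
p_from_jitter = q_function(1.0 / jitter_std) # Pr(Delta T > 1)
print(loss_per_symbol, loss_per_soliton, p_from_jitter)
```

The per-symbol loss comes out slightly below half a bit, in line with the rounded figure quoted above.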

Assuming the spacing between solitons of the same symbol is about $a/\eta_{\min}$ and that the original distance
between symbols was $C/\eta_{\min}$, the soliton rate is increased by a factor of $2 \cdot \frac{C}{C+a}$. We approximate the

mix-up probability to be Pmix-up ~ Q ^v^f^)- Thus for this setting the "modulation gain" compared to 
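a simple "on-off" system is approximately:

$$ 2 \cdot \frac{C}{C + a} \cdot \max_{\eta_{\min}} \frac{\eta_{\min}}{\eta_{\max}} \cdot \left(1 + \log\frac{\Delta\eta}{\sqrt{2\pi e\,\eta_{\max}\epsilon^2 Z}} - H_b(p_{\text{mix-up}})/2\right). \qquad (17) $$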

The modulation gain for a certain set of parameters is shown in Figure 4 . The gain compared to single 
soliton trains is roughly 2 for a wide set of parameters. 

D. Approximating the information embedded in a soliton train - slightly above the Gordon-Haus rate

The next natural generalization is to consider an N-bound solution that is made up of a train of
well-spaced (spacing relates to the value of the norming constants) eigenfunctions (we assume N to be large,
i.e. ^5). The analysis of the former subsection is still a good approximation. The difference is that now the 

ambiguity in time of arrival is not bounded to a pair of solitons. Still, if the eigenfunctions are properly
spaced the entropy of the order-of-arrival sequence, $H(\pi^n)$, is mainly determined by the probability that
consecutive eigenfunctions will change their order of arrival. The information theoretic penalty on the bit
rate due to this effect is:

$$ \tfrac{1}{2} H(p_{\text{mix-up}},\, 1 - 2p_{\text{mix-up}},\, p_{\text{mix-up}}) \approx p_{\text{mix-up}} \cdot \log 1/p_{\text{mix-up}}. \qquad (18) $$

Now, assume the spacing between solitons is approximately (much smaller than the one called for 
by the Gordon-Haus limit) and the total modulation gain in this setting is: 

C max • ( 1 -Flog , == - Pmix-up ■ log l/Pmix-up ) ■ (19) 

Vmin r]max \ ^J-Kerjmax^ Z I 
Fig. 4. Modulation gains as a function of $\eta_{\min}$ for two soliton trains vs. single soliton trains.

Again, there is no problem with trains of eigenfunctions with a typical mix-up (between consecutive 
eigenfunctions) probability of 0.1. Moreover now there is a clear tradeoff for eigenfunction spacing. The 
bigger the spacing the smaller the symbol rate. As the spacing becomes smaller the penalty due to jitter 
is larger and so a unique maximum exists. The main disadvantage compared to the previous subsection 
is that the processing now involves a more complicated channel code. The main advantage is a larger 
symbol rate. 

The analysis above neglects a few things: 

1) There is a small coupling between amplitude and time-of-arrival fluctuations. A precise analysis should
only yield a higher rate.

2) When two solitons walk by each other, their statistics change. In many cases, their amplitude
fluctuations grow and become dependent. We ignore the growth in fluctuations since, assuming that
solitons are not too crowded, the walk-off is time bounded and its effects are negligible. Furthermore,
the dependency can only increase the rate.

3) We ignore the possibility that a soliton will die/be born. This happens with a small probability. We
conjecture that its effect on the achievable rates can also be bounded.

IV. Discussion and further work 

The notion of modulating the "natural" domain of the channel is not new to communication theory. In
fact, the scheme discussed in this paper can be considered to be the nonlinear analog of OFDM. Both
methods allow for a natural examination of their respective channel capacities. There are two main
differences between the two methods. The first is that in linear channels the noise projection on different
modes (spectral bands) is orthogonal while in the nonlinear case the noise projection on different modes
(solitons) is orthogonal only in some cases (see Figure 5). The second is that OFDM is very efficient in
terms of complexity (through the use of the celebrated FFT and IFFT) while the direct scattering is a
computationally intensive method.




Fig. 5. Variance gain (through Monte-Carlo simulations) due to proximity of eigenfunctions. The eigenvalues are 2 and 1. The first 
eigenfunction has t = while the second one's position is changed. 

1) Consider the general information theoretic problem of "communicating with colorless balls". Unlike 
Nyquist samples in normal channels, co-located solitons cannot be differentiated. This means that
mutual information is between two sets of values and not two sequences. For more on this issue 
see the work by Meron et al. [12]. 

2) Use the approach discussed in the paper with more complex potentials/waveforms (not reflectionless) 
to lower and upper bound the overall capacity (and not just achievable rates). 

3) Find reasonable complexity (preferably analog) methods to carry out the tasks of inverse and 
especially direct scattering in the transmitter and receiver. 

Appendix 



% = ^h^ R*K - mln dT (20) 
dZ a„ J 

^ = i^ /(^c - jns)i^fn - i^c + ns)i)ln dT (21) 
dZ a„ J 
Abstract

We analyze the achievable communication rates of a generalized soliton-based transmission system for the
optical fiber channel. This method is based on modulation of parameters of the scattering domain, via the inverse
scattering transform, by the information bits. The decoder uses the direct spectral transform to estimate these
parameters and decode the information message. Unlike ordinary On-Off Keying (OOK) soliton systems, the
solitons' amplitude may take values in a continuous interval. A considerable rate gain is shown in the case where
the waveforms are 2-bound soliton states. Using traditional information theory and inverse scattering perturbation
theory, we analyze the influence of the amplitude fluctuations as well as soliton arrival time jitter, on the achievable
rates. Using this approach we show that the time of arrival jitter (Gordon-Haus) limits the information rate in a
continuous manner, as opposed to a strict threshold in OOK systems.



I. Introduction 

Communication through optical fiber channels has evolved enormously in the past couple of decades 
leading to unprecedented information rates. Current information theoretic techniques are unsuccessful in 
producing relevant methods to predict capacity bounds for these channels. 

The nonlinear terms that affect signal evolution led to the following question: Is the information capacity 
of the optical fiber channel monotonically increasing with the input power and if so does the capacity 
grow logarithmically with power as it does for linear channels? Moreover, as the complexity allowed
in receivers grows, one looks for insights regarding the best (not necessarily the simplest) modulation 
schemes, signal space and error correcting codes. 

The basic generic partial differential equation (PDE) that describes the value of the electric field in 
space and time (in one dimension) in the optical fiber channel is (using normalized coordinates and the 
notations of [1]): 

$$ i\frac{\partial q}{\partial z} + \frac{1}{2}\frac{\partial^2 q}{\partial t^2} + |q|^2 q = 0 \qquad (1) $$

where the input of the channel is $q(0,t)$ and the output is $q(L,t)$. This equation is also known as the
nonlinear scalar Schrödinger (NLS) equation.

Since the equivalent channel is nonlinear, a Fourier frequency based analysis is not applicable. The 
usual way to analyze a continuous-time channel in traditional information theoretic methods is to reduce 
the problem into a discrete one by considering the Nyquist samples of the input and output. However, 
since a bandlimited input signal evolves into an output signal of an infinite bandwidth, it is hard to 
find such discrete-time models. We stress that the nonlinearity invoked by the channel is fundamental 
and is conceptually different from nonlinearities caused by transmitter/receiver elements, e.g., amplifier
nonlinearities, that have been studied in the past. 

A different approach to analyzing signal evolution in nonlinear channels is the inverse scattering
transform (IST). In this paper we present this method and apply it to a few tractable problems in which
we approximate the achievable data rates. We also explain how this method should be developed to
characterize the channel capacity and useful modulation schemes. A similar approach, first proposed by
Hasegawa and Nyu ([2]), suggested using multiple solitonic waveforms. It should be noted that the IST
approach presented in this paper is not complete in the following aspects:

• It does not provide single letter results for capacity but rather a new method to evaluate it which we 
feel is more esthetic and better suited for this channel. 

• It does not solve the problems associated with the bounded symbol rate for solitonic waveforms 
which is characterized by the Gordon-Haus bound ([3]). 

• It lacks a simple representation of the manner in which white noise is projected onto complex solitonic 
waveforms. 

We now give a short introduction to the inverse scattering transform which solves a set of nonlinear
evolution problems via the solution of three linear problems. A recent, more complete introduction to the
IST and its properties can be found in [4].

II. A PRIMER ON THE INVERSE SCATTERING TRANSFORM 

The inverse scattering method does not consist of a single generic transform. In fact, it is more like a
recipe for solving a family of nonlinear evolution problems. This recipe involves finding two q-dependent
operators, L and M, that obey certain conditions. The first operator of the two defines an eigenvalue
problem for an auxiliary wave function. This problem gives rise to solutions that obey boundary conditions
at $-\infty$ and $\infty$. The way these solutions evolve from $-\infty$ to $\infty$ defines the scattering coefficients or the
scattering data which is analogous to spectral content in the Fourier frequency domain for linear channel
problems. Extracting the scattering data from the q-dependent operator is called the direct transform. Due
to special properties of the above operators the evolution of the scattering data in time is rather simple.
Moreover, there is a well defined inverse transform that maps the scattering data back to q. All of the
above steps, direct transform, inverse transform and time evolution are essentially linear problems. We
now present the details of the IST for NLS.

To solve integrable systems such as the NLS one needs to express the system as a compatibility condition
of two linear equations for a wave function, $\Psi(T, Z; \zeta)$:

$$ L(Z)\Psi = \zeta\Psi \qquad (2) $$

$$ \frac{\partial \Psi}{\partial Z} = M(Z)\Psi \qquad (3) $$

where L and M are differential operators in the T-derivatives and are called a Lax pair if:

$$ \frac{\partial L}{\partial Z} = ML - LM = [M, L]. \qquad (4) $$

The right hand side is called the commutator of M and L. If (4) holds then one can show that the
eigenvalues of the operator L are Z-invariant:

$$ d\zeta/dZ = 0, $$

even though L is not Z-invariant.
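
The invariance of the eigenvalues can be seen in a few lines. The sketch below assumes the two linear problems read $L\Psi = \zeta\Psi$ and $\partial\Psi/\partial Z = M\Psi$, together with the Lax condition $\partial L/\partial Z = ML - LM$; it is the standard argument and is not quoted from the text.

```latex
\begin{align*}
\frac{\partial}{\partial Z}\left(L\Psi\right) &= \frac{\partial}{\partial Z}\left(\zeta\Psi\right)
 \;\Longrightarrow\;
 \frac{\partial L}{\partial Z}\Psi + L\frac{\partial \Psi}{\partial Z}
 = \frac{d\zeta}{dZ}\Psi + \zeta\frac{\partial \Psi}{\partial Z}, \\
(ML - LM)\Psi + LM\Psi &= \frac{d\zeta}{dZ}\Psi + \zeta M\Psi, \\
M(L\Psi) - \zeta M\Psi &= \frac{d\zeta}{dZ}\Psi
 \;\Longrightarrow\; \frac{d\zeta}{dZ}\Psi = 0 .
\end{align*}
```

Since $\Psi$ is a nontrivial eigenfunction, the last line forces $d\zeta/dZ = 0$ even though $L$ itself changes with $Z$.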

Finding a Lax pair for a given channel is not an obvious task. The Lax pair for the NLS, found by
Zakharov and Shabat, is given by:

^ - ( if. -4 ) 



9^ _lM^|2 n 9 |1„ 
It is readily verified that for these operators, equation (4) results in the NLS equation. To solve equation
(2) we define vector wave functions for real $\xi = \zeta$ with asymptotic boundary conditions:

$$ \Phi(T;\xi) \to \begin{pmatrix} 1 \\ 0 \end{pmatrix} e^{-i\xi T}, \quad T \to -\infty \qquad (7) $$

$$ \Psi(T;\xi) \to \begin{pmatrix} 0 \\ 1 \end{pmatrix} e^{i\xi T}, \quad T \to \infty. \qquad (8) $$

The pair $\Psi$, $\bar{\Psi} = (\psi_2^*, -\psi_1^*)^{\top}$ is a complete system of solutions for (2). Therefore:

$$ \Phi(T,\xi) = a(\xi)\bar{\Psi} + b(\xi)\Psi. \qquad (9) $$

For $T \to \infty$ we have:

$$ \Phi \to a(\xi)\begin{pmatrix} 1 \\ 0 \end{pmatrix} e^{-i\xi T} + b(\xi)\begin{pmatrix} 0 \\ 1 \end{pmatrix} e^{i\xi T}. \qquad (10) $$

Comparing with equation (7) we recognize $1/a(\xi)$ and $b(\xi)/a(\xi)$ as the transmission and reflection
coefficients which characterize the scattering data. The origin of these names is in the fact that they
describe what happens to a wave as it evolves from $-\infty$ to $\infty$ and scatters due to a certain "potential",
q (these terms are borrowed from quantum physics).
The discrete eigenvalues of the direct scattering problem are the set of points:

$$ \zeta = \{\zeta_n,\ n = 1, 2, \ldots, N;\ \mathrm{Im}(\zeta) > 0 \ \text{s.t.}\ a(\zeta) = 0\} \qquad (11) $$

for which:

$$ \Phi(T;\zeta_n) = b_n \Psi(T;\zeta_n). \qquad (12) $$
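
To make the direct transform concrete, the following sketch evaluates $a(\zeta)$ numerically for a sech-shaped potential and checks that it vanishes at the expected discrete eigenvalue. It assumes the common first-order form of the eigenvalue problem, $v_T = \begin{pmatrix} -i\zeta & q \\ -q^* & i\zeta \end{pmatrix} v$, with $v \to (1, 0)^{\top} e^{-i\zeta T}$ as $T \to -\infty$; this sign convention, the piecewise-constant integration scheme and the name `scattering_a` are assumptions of the example and are not taken from the text.

```python
import numpy as np

def scattering_a(zeta, q_func, half_width=20.0, n_steps=4000):
    """Approximate a(zeta) for v' = [[-i*zeta, q], [-conj(q), i*zeta]] v.

    The potential is treated as constant on each step (sampled at the midpoint),
    so each step applies an exact 2x2 matrix exponential.
    """
    h = 2.0 * half_width / n_steps
    # left boundary condition: v = (1, 0) * exp(-i*zeta*T) at T = -half_width
    v = np.array([np.exp(1j * zeta * half_width), 0.0], dtype=complex)
    for n in range(n_steps):
        t_mid = -half_width + (n + 0.5) * h
        q = q_func(t_mid)
        P = np.array([[-1j * zeta, q], [-np.conj(q), 1j * zeta]], dtype=complex)
        k = np.sqrt(complex(zeta * zeta + abs(q) ** 2))
        # exp(h*P) = cos(k*h)*I + sin(k*h)/k * P, because P @ P = -k**2 * I
        s = h if abs(k) < 1e-12 else np.sin(k * h) / k
        v = np.cos(k * h) * v + s * (P @ v)
    # a(zeta) is the coefficient of (1, 0) * exp(-i*zeta*T) at the right edge
    return v[0] * np.exp(1j * zeta * half_width)

def soliton(t):
    """Unit-amplitude soliton profile, eta = 1 and kappa = 0."""
    return 1.0 / np.cosh(t)

a_at_eigenvalue = scattering_a(0.5j, soliton)   # expected to be close to zero
a_elsewhere = scattering_a(1.0j, soliton)       # expected to be close to 1/3
print(abs(a_at_eigenvalue), a_elsewhere)
```

For a single soliton with $\eta = 1$ the closed form $a(\zeta) = (\zeta - i/2)/(\zeta + i/2)$ gives a zero at $\zeta = i/2$, matching $\zeta = (\kappa + i\eta)/2$ with $\kappa = 0$. A receiver would search for such zeros in the upper half plane.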

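Equation (12) shows that both $\Psi$ and $\Phi$ approach zero as T approaches infinity. The scattering data, which
has a one-to-one correspondence with q and hence carries the same information, is comprised of:

$$ \Sigma(z = 0) = \left[\, r(\xi; 0) = b(\xi; 0)/a(\xi; 0) \ \text{for real } \xi;\ \{\zeta_n, C_n(0)\} \ \text{for } n = 1, 2, \ldots, N \,\right], \qquad (13) $$

where:

$$ C_n(0) = b_n(0)/a'_n(0), \qquad a'_n(0) = \frac{\partial a}{\partial \zeta}(Z = 0; \zeta_n) \qquad (14) $$

are called the norming constants of the bound states.

The time evolution of the scattering data is governed by (3), the solution of which (see [1]) is:

$$ r(\xi; Z) = r(\xi; 0)\, e^{-2i\xi^2 Z} \qquad (15) $$

$$ C_n(\zeta_n; Z) = C_n(\zeta_n; 0)\, e^{-2i\zeta_n^2 Z} \qquad (16) $$

$$ \zeta_n(Z) = \zeta_n(0). \qquad (17) $$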

The inverse problem of finding q given the scattering data is solved by a set of linear integral equations
which are beyond the scope of this introduction.

The IST is important because it allows the use of linear techniques to solve initial value problems for
nonlinear problems. The main advantages of the IST are that the number of degrees of freedom that a
signal is comprised of, i.e. number of solitons and radiation bandwidth, does not change through signal
evolution and that there are natural invariant-over-time scalar entities, i.e. eigenvalues. The evolution of
the solution in time is most naturally described through the IST and thus the IST may lead us to insights
regarding communication strategies. For an in-depth survey of the IST, also known as the nonlinear Fourier
transform, and an OFDM-like communication transmission method, see the paper by Yousefi et al. ([4],
[5]).
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Actually, Hasegawa and Nyu (see [2], [l])proposed a communication method that utilizes the fact that 
the eigenvalues associated with the 1ST do not change in time. The advantages of the method proposed by 
Hasegawa et al. is that it is inherently multi-valued and is similar to frequency based methods for linear 
channels. The authors do not analyze the effects of amplifier noise on the eigenvalues and its implications 
on channel capacity. In the following we elaborate on the ideas of eigenvalue communications, extend it, 
and use results from perturbation theory (see for example [6], [7]) for nonlinear models to estimate the 
capacity of nonlinear channels. We extend the idea of eigenvalue communication to that of spectral data 
modulation and use the inverse scattering transform as our transmitter and the direct spectral transform 
in the receiver. We quantify the effects of amplitude fluctuations and jitter on achievable communication 
rates and evaluate them for realistic configurations. 

III. Carrying information using the scattering data 
We assume that the channel model is represented by: 

$$i\frac{\partial q}{\partial Z} + \frac{1}{2}\frac{\partial^2 q}{\partial T^2} + |q|^2 q = i\epsilon R \tag{18}$$

where $\epsilon R$ is the perturbation term. Throughout this paper we assume that $R$ is a white Gaussian noise
process (in space and time) with a unit power spectral density (PSD), and $\epsilon$ is used as a scaling parameter
for the noise power that can be related to the physical parameters of the channel. We will later plug in
these parameters to obtain practical results. The noise is generated by the amplifiers that are
spread throughout the fiber, but we assume it is injected adiabatically.¹

The information rate, $R_b$, that can be achieved on this channel is upper bounded by the channel capacity,
which is the maximal mutual information between the channel's input and output [8]:

$$R_b \le \max I(q(0,T);\ q(L,T)), \tag{19}$$

where the maximization is taken over all input distributions that satisfy some input constraint (e.g., an average
power constraint, a peak power constraint, Fourier bandwidth or maximal number of solitons). Evaluating the
quantity above turns out to be a very difficult task for nonlinear channels. In this paper we argue that the most
tractable way of evaluating this quantity is through the statistics of the scattering data of the IST, namely the
eigenvalues and the absolute value of the norming constants.

Since the IST is a one-to-one transformation, the mutual information between the waveforms is equal
to the mutual information between the scattering data, i.e.,

$$I(q(0,T);\ q(L,T)) = I(\Sigma(Z = 0);\ \Sigma(Z = L)). \tag{20}$$

To lower bound this quantity one can assume that the input is a reflectionless potential, so that the
information is transmitted solely through the discrete eigenvalues and the corresponding norming constants, i.e.,

$$I(\Sigma(Z = 0);\ \Sigma(Z = L)) \ge I(\{\zeta_n(0), C_n(0)\};\ \{\zeta_n(L), C_n(L)\}) \quad \text{for } n = 1, 2, \ldots, \infty,$$

where the time index is added since the Gaussian noise changes the eigenvalues (that are otherwise 
constant) and can also possibly change their number via the birth/death of a soliton. 

The main observation of this paper is that the mutual information in a nonlinear integrable channel can and
should be evaluated through the statistics of the scattering data. This approach is
motivated by several reasons. First, unlike the linear spectral domain (i.e., Fourier methods, where spectral
broadening is a result of the nonlinearity), the number of degrees of freedom in the scattering domain
remains unchanged throughout the noiseless evolution. Second, the eigenvalues and norming constants
serve as scalar candidates for the transmission of information, implying a new notion of a nonlinear signal
space. The evaluation of equation (21) is still a cumbersome task, yet it can be approximated assuming
some further restrictions on the input signals.

IV. Main Results 

In the generalized soliton transmission system we analyze, a codeword is a (large) set of symbols. 
Each symbol is in fact a set of eigenvalues and norming constants. At the transmitter, the waveform to 
be transmitted is generated using the inverse scattering transform. At the receiver, direct scattering is 
applied to derive the set of (perturbed) eigenvalues and norming constants. The waveforms used by the 
transmitter have infinite support but decay exponentially, so that if we truncate the waveforms at a suitable
distance to create a finite symbol period, we can treat the resulting soliton interaction as negligible
compared to the added noise.

Throughout this Section the imaginary parts of the eigenvalues, which can be considered to be generalized
amplitudes, will be the information carrying agents.

Fig. 1. The IST-based communication scheme



A. Information embedded in a single soliton 

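In this setting single solitons are modulated. Unlike in ordinary OOK, their amplitudes belong to a
continuous interval. Without a perturbation, the single soliton solution for the NLS is

$$q(T,Z) = \eta\,\mathrm{sech}\left[\eta(T + \kappa Z - T_0)\right] \exp\left(-i\kappa T + i\frac{\eta^2 - \kappa^2}{2} Z + i\sigma_0\right), \tag{21}$$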

for which the corresponding discrete eigenvalue of the IST is $\zeta = (\kappa + i\eta)/2$. For the rest of the paper
we assume all eigenvalues are purely imaginary (except for perturbations). The localization of the soliton 
is around ITqI = e^^^^". 
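
The correspondence between a soliton of amplitude $\eta$ and the eigenvalue $\zeta = i\eta/2$ (for $\kappa = 0$) can be checked numerically. The following Python sketch is only an illustration under stated assumptions: it assumes the Zakharov-Shabat system in the form $v_1' = -i\zeta v_1 + q v_2$, $v_2' = -q^* v_1 + i\zeta v_2$ (the normalization of (2) may differ), and the function names, integration window and step count are hypothetical choices, not part of the original scheme. It integrates the system with a Runge-Kutta step from the boundary condition (7) and reads off $a(\zeta)$ from the asymptotic form (10).

```python
import cmath
import math


def sech_pulse(eta, t0=0.0):
    """Return q(T) = eta * sech(eta * (T - t0)): a soliton at Z = 0 with kappa = 0."""
    return lambda t: eta / math.cosh(eta * (t - t0))


def scattering_coefficient_a(zeta, q, t_min, t_max, steps=4000):
    """Estimate a(zeta) by RK4 integration of the assumed Zakharov-Shabat system."""

    def rhs(t, v1, v2):
        qt = complex(q(t))
        return (-1j * zeta * v1 + qt * v2,
                -qt.conjugate() * v1 + 1j * zeta * v2)

    h = (t_max - t_min) / steps
    t = t_min
    # Left boundary condition, as in (7): Phi -> (1, 0)^T exp(-i zeta T).
    v1, v2 = cmath.exp(-1j * zeta * t_min), 0j
    for _ in range(steps):
        k1 = rhs(t, v1, v2)
        k2 = rhs(t + h / 2, v1 + h / 2 * k1[0], v2 + h / 2 * k1[1])
        k3 = rhs(t + h / 2, v1 + h / 2 * k2[0], v2 + h / 2 * k2[1])
        k4 = rhs(t + h, v1 + h * k3[0], v2 + h * k3[1])
        v1 += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v2 += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        t += h
    # As in (10): the first component behaves as a(zeta) exp(-i zeta T) for large T.
    return v1 * cmath.exp(1j * zeta * t_max)
```

For example, `abs(scattering_coefficient_a(0.75j, sech_pulse(1.5), -20.0, 20.0))` should be close to zero, since $\zeta = 0.75i$ is the eigenvalue of a soliton with $\eta = 1.5$, while other points on the imaginary axis give a clearly nonzero $a(\zeta)$.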

We use results from [1] for the first order perturbations of the eigenvalues. The resulting fluctuation in
the amplitude is:

$$\frac{d\eta}{dZ} = \epsilon \int \mathrm{Re}\left(R\,e^{-i\phi}\right) \mathrm{sech}\,\tau \, d\tau \tag{22}$$

where $\tau = \eta(T - T_0)$ and $\phi$ is the phase of the unperturbed soliton.

Assuming $R$ is a bandlimited white Gaussian noise, i.e., $\langle R(t,z), R(t',w) \rangle = \delta(t - t')\delta(z - w)$, we
get:

$$E[(\eta(0) - \eta(Z))^2] = \epsilon^2 \eta Z \tag{23}$$

i.e., the variance of the additive noise is proportional to $\eta$ (unlike ordinary multiplicative noise, for which
the variance is proportional to $\eta^2$).

Thus, assuming information is transmitted in the amplitude of a single soliton ($T_0 = 0$), we have the
following scalar channel:

$$Y = 2\,\mathrm{Im}(\zeta(Z)) = \eta(Z) = \eta + \sqrt{\eta}\,N \tag{24}$$

where $N$ is a Gaussian r.v. with zero mean and a variance of $\epsilon^2 Z$. We dismiss the probability that the
soliton vanishes completely and allow for $Y$ to be theoretically zero (or negative). This scenario can
be prevented (with high probability) by using $\sqrt{\eta} \gg \epsilon\sqrt{Z}$, which in the limit of $\epsilon$ going to zero has a
negligible effect on the capacity. We lower bound the mutual information for the case $\eta \in [\eta_{min}, \eta_{max}]$
with $\Delta\eta = \eta_{max} - \eta_{min}$. It is assumed that the noise is Gaussian and of the largest possible variance:



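$$I \ge h(Y) - h(Y|\eta) \tag{25}$$
$$\ge h(\eta) - h(Y|\eta) \tag{26}$$
$$\ge h(\eta) - \frac{1}{2}\ln\left(2\pi e\,\eta_{max}\epsilon^2 Z\right) \tag{27}$$
$$= \log\frac{\Delta\eta}{\sqrt{2\pi e\,\eta_{max}\epsilon^2 Z}} \quad \text{bits/soliton} \tag{28}$$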



where we use the uniform distribution as the input prior and bound (27) using the fact that Gaussian noise 
has the highest entropy for a given variance. We refer to this quantity as the "soliton spectral efficiency" 
which can be considered to be the NLS analog of spectral efficiency in conventional (linear) channels 
where it's measured in bits/Hertz. 
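
As a small illustration of how the bound (28) scales, the sketch below evaluates it in Python. The function name and the choice of passing the product $\epsilon^2 Z$ as a single argument `eps2_z` are assumptions made here for convenience; the logarithm is taken in base 2 so that the result is in bits/soliton.

```python
import math


def soliton_spectral_efficiency(eta_min, eta_max, eps2_z):
    """Lower bound (28) in bits/soliton for amplitudes in [eta_min, eta_max].

    eps2_z stands for the product epsilon^2 * Z (hypothetical parameter name).
    """
    delta_eta = eta_max - eta_min
    # Worst-case (largest) noise variance over the interval is eta_max * eps2_z.
    worst_case_scale = math.sqrt(2.0 * math.pi * math.e * eta_max * eps2_z)
    return math.log2(delta_eta / worst_case_scale)
```

Because the noise enters through $\sqrt{\epsilon^2 Z}$, reducing `eps2_z` by a factor of four adds exactly one bit per soliton to the bound, and doubling the interval width $\Delta\eta$ at a fixed `eta_max` does the same.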

The capacity can also be directly evaluated using the Blahut-Arimoto algorithm ([9], [10]). Using this
algorithm for the channel model $Y = \eta + \sqrt{\eta}N$ restricted s.t. $\eta \in [1,2]$ and $E(N^2) = 0.1^2$ we get
that the true capacity is 1.568 bits per channel use while our bound reads 1.275 bits per channel use.
The capacity achieving prior and the resultant $Y$ distribution are plotted in Figures 2 and 3. Note that
the capacity achieving prior has both atoms and a continuous distribution, which is typical of interval
constrained capacity problems ([11]). 
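
A minimal Python sketch of such a Blahut-Arimoto evaluation is given below. It is an illustration under assumptions rather than the computation used for the numbers above: the input interval and the output axis are discretized on grids chosen by the caller, the helper names are hypothetical, and the returned value is the standard Blahut-Arimoto lower bound on the capacity of the discretized channel. It is therefore not expected to reproduce the quoted figures exactly, which depend on the grid resolution and on the noise normalization.

```python
import math


def gaussian_cdf(x, mean, std):
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def channel_matrix(etas, edges, noise_std):
    """Discretize Y = eta + sqrt(eta) * N; rows are inputs, columns are output bins."""
    rows = []
    for eta in etas:
        std = math.sqrt(eta) * noise_std
        cdf = [gaussian_cdf(e, eta, std) for e in edges]
        row = [cdf[j + 1] - cdf[j] for j in range(len(edges) - 1)]
        total = sum(row)  # renormalize the small mass that falls outside the grid
        rows.append([x / total for x in row])
    return rows


def blahut_arimoto(P, max_iter=200, tol=1e-4):
    """Return (capacity lower bound in bits, input prior) for transition matrix P."""
    k, m = len(P), len(P[0])
    p = [1.0 / k] * k
    lower = 0.0
    for _ in range(max_iter):
        # Output distribution induced by the current prior.
        q = [sum(p[i] * P[i][j] for i in range(k)) for j in range(m)]
        # Per-input divergence D(P_i || q) in nats.
        d = [sum(P[i][j] * math.log(P[i][j] / q[j])
                 for j in range(m) if P[i][j] > 0.0 and q[j] > 0.0)
             for i in range(k)]
        z = sum(p[i] * math.exp(d[i]) for i in range(k))
        lower, upper = math.log(z), max(d)
        p = [p[i] * math.exp(d[i]) / z for i in range(k)]
        if upper - lower < tol:
            break
    return lower / math.log(2.0), p
```

A call such as `blahut_arimoto(channel_matrix(etas, edges, 0.1))`, with `etas` a grid on $[1,2]$ and `edges` covering the interval by several noise standard deviations on each side, returns both the capacity estimate and the optimized prior, whose concentration on a few grid points reflects the atoms mentioned above.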

B. Information embedded in a soliton train - below the Gordon-Haus rate

The above result shows that the interval $\Delta\eta = \eta_{max} - \eta_{min}$ should be as large as possible to allow
each soliton to convey as many bits as possible. However, when one considers transmitting many solitons
one after the other, there are other considerations which bound the optimal interval size from both sides,
namely intersoliton interaction and arrival time jitter.

We now consider the case where many solitons are modulated sequentially. The distance between
neighboring solitons is a multiple of the width of the widest soliton, i.e., $C/\eta_{min}$, where $C$ is chosen so that
the intersoliton interaction has a negligible (compared to that of the noise) effect on the eigenvalues. The
distance between solitons is inversely proportional to the symbol rate, and thus in an optimal system $\eta_{min}$
is bounded from below.

Since we wish to assume a perfectly (or at least an almost perfectly) synchronized communication
system, the typical arrival time jitter needs to be less than the distance between neighboring solitons. The
time of arrival jitter is known to be directly connected to fluctuations of the real part of the eigenvalue, which
is linearly related to the velocity of the soliton, as can be seen from (21). The fluctuations of the real part
of the eigenvalue are very similar to those of the imaginary part:

E[{k{0) - K{Z)f] = ^ (29) 

Using ^ = we integrate to account for the arrival time jitter (neglecting terms that do not 

originate from the velocity change): 

E[{To{0) - To{Z)y] ^ (30) 

This is the known Gordon-Haus ([3]) phenomenon that bounds the symbol rate of all regular soliton systems
(including OOK). The worst-case arrival time jitter is proportional to $\eta_{max}$. Thus, requiring an (almost)
jitter-free model, e.g., an out-of-synchronization probability of 10"^, bounds $\eta_{max}$ from above.

We wish to compare the gain (in terms of bits/second) of the continuous amplitude modulation scheme
versus that of the OOK modulation. We assume that $\eta_{max}$ is tuned by the Gordon-Haus bound requiring
no jitter and is shared by both the continuous system and the on-off reference system. The continuous
system has a lower symbol rate, which is $\eta_{max}/\eta_{min}$ times smaller than that of the reference system.² However,
the continuous system conveys more bits than just one per soliton. Weighing both terms, the continuous
system has a bit rate which is

$$MG = \max_{\eta_{min}} \frac{\eta_{min}}{\eta_{max}} \cdot \log\frac{\Delta\eta}{\sqrt{2\pi e\,\eta_{max}\epsilon^2 Z}} = \max_{\eta_{min}} \frac{\eta_{min}}{\eta_{max}} \cdot \log\frac{\Delta\eta}{\sigma_{eff}} \tag{31}$$

times that of the reference system. We refer to this term as the "modulation gain". If one also
considers the possibility that a symbol can contain no soliton at all, and if $\eta_{min} \gg \sqrt{\eta_{max}}\,\epsilon\sqrt{Z}$, so
that the transfer probability between the continuous interval and the zero hypothesis is less than
$10^{-3}$, then the modulation gain approximately reads:

$$\max_{p} \max_{\eta_{min}} \frac{\eta_{min}}{\eta_{max}} \cdot \left( H_b(p) + p \cdot \log\frac{\Delta\eta}{\sqrt{2\pi e\,\eta_{max}\epsilon^2 Z}} \right) = \tag{32}$$
$$\max_{p} \max_{\eta_{min}} \frac{\eta_{min}}{\eta_{max}} \cdot \left( H_b(p) + p \cdot \log\frac{\Delta\eta}{\sigma_{eff}} \right), \tag{33}$$

where $H_b(p)$ is the binary entropy of $p$ (see union of channels in [12]).

² Actually, one can also analyze the case where symbol widths are not constant and are proportional to

The modulation gain is plotted
in Figure 4 for different values of $\sigma_{eff}$. It is evident that as the effective SNR improves a larger $\eta_{min}$ is
better since it does not reduce the symbol rate. 
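
The trade-off expressed by (31) and (33) can be explored with a simple grid search. The Python sketch below is an illustration under assumptions: the function name, the `allow_empty_slot` flag and the grid resolution are hypothetical, and logarithms are taken in base 2. For the variant with an empty slot, the inner maximization over $p$ is done in closed form, using the fact that $\max_p [H_b(p) + pL] = \log_2(1 + 2^L)$, attained at $p = 2^L/(1 + 2^L)$.

```python
import math


def modulation_gain(eta_max, sigma_eff, allow_empty_slot=False, grid=2000):
    """Grid search over eta_min for (31), or for (33) when allow_empty_slot is True.

    Returns (best_gain, best_eta_min); best_eta_min is None if no gain is positive.
    """
    best_gain, best_eta_min = 0.0, None
    for k in range(1, grid):
        eta_min = eta_max * k / grid
        ratio = (eta_max - eta_min) / sigma_eff  # Delta eta / sigma_eff
        if allow_empty_slot:
            # Closed form of max over p of H_b(p) + p * log2(ratio).
            bits = math.log2(1.0 + ratio)
        else:
            if ratio <= 1.0:
                continue  # the bound (28) is not positive here
            bits = math.log2(ratio)
        gain = (eta_min / eta_max) * bits
        if gain > best_gain:
            best_gain, best_eta_min = gain, eta_min
    return best_gain, best_eta_min
```

Running the search for decreasing `sigma_eff` shows the maximizing `eta_min` moving toward `eta_max`, in line with the observation that a larger $\eta_{min}$ is preferable at higher effective SNR.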




Fig. 4. Modulation gains as a function of $\eta_{min}$, for different $\sigma_{eff}$.



C. Information embedded in a 2-bound soliton train- below the Gordon-Haus rate 

The system described above could be analyzed using the framework of perturbations to sech profiles
without necessarily using the perturbation theory of the inverse scattering transform. However, when considering
more complicated symbols made up of more than one soliton, the IST has major analytical and practical
advantages. This is the case when the symbols are confined to be either a 2-soliton bound state or a single
soliton (or none). We now analyze the modulation gain of this more complicated system and address such
issues as common jitter and whether the solitons should be concentric or partially spaced apart.

The idea of transmitting a few concentric solitons is proposed in the paper by Hasegawa et al. However,
a 2-bound soliton is affected by noise differently than each one of its components. We show that a 2-bound
soliton solution has a larger jitter than its components. Therefore there is a tradeoff between the enlarged
bit rate and a smaller symbol rate that is induced by a larger jitter.

The basic symbol is now comprised of a 2-bound soliton. This means the transmitter solves the following
reflectionless algebraic inverse scattering problem for $N = 2$ ([1]):

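$$\mu_n = \sqrt{C_n}\,\Psi(T;\zeta_n), \qquad n = 1, 2$$
$$\boldsymbol{\mu} = (\mu_1, \ldots, \mu_N)$$
$$M_{nm} = \frac{e_n e_m^*}{\zeta_n - \zeta_m^*}$$
$$e_n = \sqrt{C_n}\exp(i\zeta_n T), \qquad E = (e_1, \ldots, e_N)^*$$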

The norming constants are used to localize the different eigenfunctions. As a generalization of the 
single soliton case, we choose |fen(0)| = e^''"*"*^*^^ where in(0) is the generalized position of the rith 
eigenfunction. Actually, the eigenfunctions interact with one another and the resulting time waveform is 
not a superposition of 2 single soliton profiles. Nevertheless, their generalized position remains unchanged 
throughout the evolution (apart from noise influence) and can be recovered at the receiver. The generalized 
position evolution is given by (to the first order): 

and thus it behaves in the same way as the center of a single soliton. However, the fluctuations of the
eigenvalues of a 2-bound soliton, in both the imaginary and real parts, are not orthogonal anymore. In fact they
are highly correlated in the case of a small separation between generalized locations or in the case of very
similar eigenvalues. Moreover, the variance of the fluctuations is generally magnified when the solitons
"overlap". This effect makes modulating non-concentric solitons (or actually eigenfunctions) a sensible
thing to do. We plot the variance of the eigenvalues as a function of the separation between the generalized
positions in Figure 5. In this setting the detector sees two eigenvalues and two norming constants that
translate to generalized positions. All of these scalar quantities are now perturbed by noise. Since the two
eigenfunctions are assumed to be too close to each other for the Gordon-Haus jitter to be neglected,
we must account for the way the jitter affects the capacity.

In linear communication problems a non-negligible jitter in symbol arrival times can diminish the 
achievable rate to zero. This is due to the fact that in a linear channel the signal space is made up of 
translations of a limited number of base functions. Once there is jitter, these functions are no longer
orthogonal and one cannot differentiate between neighboring symbols.

However, in a nonlinear integrable system, solitons can be detected through the direct scattering 
transform even if they are one on top of the other. Actually, they can be detected but not differentiated, 
i.e., both will be apparent but the receiver will not know which of the two belongs to the original slot. 

Fig. 5. Variance gain (through Monte-Carlo simulations) due to proximity of eigenfunctions. The eigenvalues are 2 and 1. The first
eigenfunction has t = while the second's position is changed.

To lower bound the achievable rate of the jitter-affected system we assume that once the eigenfunctions
are detected they are sorted according to time of arrival. This channel is equivalent to transmitting a
couple of solitons (eigenvalues), adding noise and finally permuting them in case they switched places.
We denote the perturbed eigenvalues before and after the possible permutation by $Y^n$ and $W^n$, respectively
($n = 2$ for the 2-bound soliton case). The permutation, which is a random variable, is denoted by $\pi^n$. The
information theoretic loss (in bits) due to the jitter is bounded by:

= h{Y^) - h{w'^) - + /i(w^riO 

< Mn",<K)-Mn"K) 
= ifK|rr,<) 

< ^W) 

For the two soliton case, the permutation R.V. is equivalent to a Bernoulli R.V. where the mix-up
probability is equal to the probability that the order of the generalized positions is changed. Using the
assumption that the eigenvalues will approximately fluctuate in the same way as if the solitons were apart
(and this is not true when they walk by each other) we can approximate this probability. For the set of
generalized positions $\{-1, 1\}$, this probability satisfies $P_{mix-up} < \Pr(\Delta T > 1)$, where $\Delta T$ is a zero-mean
Gaussian random variable with the jitter-induced variance.
If this probability turns out to be $P_{mix-up} = 0.1$, which is conventionally thought to be prohibitively large,
the rate loss is only $H_b(0.1) \approx 0.5$ bits for the 2-soliton symbol and only 0.25 bits per soliton ($H_b(p)$ is
the Shannon binary entropy function). The main advantage is a major increase in the soliton rate, since
there are two solitons per symbol. 
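
The size of this penalty is easy to evaluate. The Python sketch below computes the binary entropy and a Gaussian tail probability; the function names and the parameters `spacing` and `jitter_std` are hypothetical, and the Gaussian tail model for the mix-up probability is the approximation described above rather than an exact result.

```python
import math


def binary_entropy(p):
    """Shannon binary entropy H_b(p) in bits."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def q_function(x):
    """Gaussian tail probability Pr(N > x) for a standard normal N."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def mixup_probability(spacing, jitter_std):
    """Tail probability that a zero-mean Gaussian displacement exceeds the spacing."""
    return q_function(spacing / jitter_std)


def jitter_penalty_per_soliton(p_mix):
    """Rate loss per soliton for a 2-soliton symbol: H_b(p_mix) / 2 bits."""
    return binary_entropy(p_mix) / 2.0
```

With `p_mix = 0.1` the symbol-level loss `binary_entropy(0.1)` is slightly below half a bit, which is the figure rounded to 0.5 bits in the text.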

Assuming the spacing between solitons of the same symbol is about $a/\eta_{min}$ and that the original distance
between symbols was $C/\eta_{min}$, the soliton rate is increased by a factor of $2 \cdot \frac{C}{C+a}$. We approximate the
mix-up probability by a Gaussian tail probability, $P_{mix-up} \approx Q(\cdot)$, whose argument is inversely proportional
to the jitter standard deviation $\sigma_{jitter}$. Thus for this setting the "modulation gain" compared to
a simple OOK system is approximately:

$$\max_{p}\ 2 \cdot \frac{C}{C+a}\ \max_{\eta_{min}} \frac{\eta_{min}}{\eta_{max}} \cdot \left( H_b(p) + p \log\frac{\Delta\eta}{\sqrt{2\pi e\,\eta_{max}\epsilon^2 Z}} - H_b(p_{mix-up})/2 \right). \tag{34}$$

The modulation gain for a certain set of parameters is shown in Figure 6. The gain compared to single
soliton trains is roughly 2 for a wide set of parameters. 




Fig. 6. Modulation gains as a function of $\eta_{min}$ for two soliton trains vs. single soliton trains.



D. Approximating the Information embedded in a soliton train- slightly above Gordon-Haus rate 

The next natural generalization is to consider an N-bound solution that is made up of a train of well-spaced
(spacing relates to the value of the norming constants) eigenfunctions (we assume $N$ to be large,
i.e., $N \ge 5$). The analysis of the former subsection is still a good approximation. The difference is that now the
ambiguity in time of arrival is not bounded to a pair of solitons. Still, if the eigenfunctions are properly
spaced, the entropy of the order-of-arrival sequence, $H(\pi^N)$, is mainly determined by the probability that
consecutive eigenfunctions will change their order of arrival. The information theoretic penalty on the bit
rate due to this effect is:

$$\frac{1}{2} H(p_{mix-up},\ 1 - 2p_{mix-up},\ p_{mix-up}) \approx p_{mix-up} \cdot \log\frac{1}{p_{mix-up}}. \tag{35}$$



Now, assume the spacing between solitons is inversely proportional to $\eta_{min}$ (and much smaller than the one
called for by the Gordon-Haus limit). The total modulation gain in this setting is:

$$\max_{p} \max_{\eta_{min}} \frac{\eta_{min}}{\eta_{max}} \cdot \left( H_b(p) + p \log\frac{\Delta\eta}{\sqrt{2\pi e\,\eta_{max}\epsilon^2 Z}} - p_{mix-up} \cdot \log\frac{1}{p_{mix-up}} \right). \tag{36}$$

Again, there is no problem with trains of eigenfunctions with a typical mix-up (between consecutive
eigenfunctions) probability of 0.1. Moreover, now there is a clear tradeoff for eigenfunction spacing. The
bigger the spacing, the smaller the symbol rate. As the spacing becomes smaller, the penalty due to jitter
becomes larger, and so a unique maximum exists. The main disadvantage compared to the previous subsection
is that the processing now involves a more complicated channel code. The main advantage is a larger
symbol rate. 
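
To get a feeling for the approximation in (35), the following Python sketch compares its two sides. The function names are hypothetical and logarithms are in base 2. For small $p_{mix-up}$ the exact expression exceeds $p_{mix-up}\log(1/p_{mix-up})$ by a term of order $p_{mix-up}$, so the approximation is tighter for rare mix-ups than for $p_{mix-up} = 0.1$.

```python
import math


def ordering_penalty_exact(p_mix):
    """Left-hand side of (35): 0.5 * H(p, 1 - 2p, p) in bits, for 0 < p <= 0.5."""
    probs = (p_mix, 1.0 - 2.0 * p_mix, p_mix)
    return -0.5 * sum(p * math.log2(p) for p in probs if p > 0.0)


def ordering_penalty_approx(p_mix):
    """Right-hand side of (35): p * log2(1 / p) in bits."""
    return p_mix * math.log2(1.0 / p_mix)
```

Comparing `ordering_penalty_exact(p)` with `ordering_penalty_approx(p)` over a range of `p` values indicates how much of the penalty in (36) is captured by the simplified term.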

The analysis above neglects a few things: 

1) There is a small coupling between amplitude and time-of-arrival fluctuations. A precise analysis should
only yield a higher rate.

2) When two solitons pass by each other, their perturbation statistics change. In many cases, their
amplitude fluctuations grow and become dependent. We ignore the growth in fluctuations since,
assuming that solitons are not too crowded, the walk-off is time bounded and its effects are negligible.
Furthermore, the dependency can only increase the rate.

3) We ignore the possibility that a soliton will die or be born. This happens with a small probability and
we assume that its effect on the achievable rates can also be bounded.

V. Discussion and further work 

The notion of modulating the "natural" domain of the channel is not new to communication theory. In
fact, the scheme discussed in this paper can be considered to be the nonlinear analog of OFDM. Both of
the methods allow for a natural examination of their respective channel capacities. There are two main
differences between the two methods. The first is that in linear channels the noise projection on different 
modes (spectral bands) is orthogonal while in the nonlinear case the noise projection on different modes 
(solitons) is orthogonal only in some cases (see Figure ). The second is that OFDM is very efficient in 
terms of complexity (through the use of the celebrated FFT and IFFT) while the direct scattering is a 
computationally intensive method. 
Future research directions include: 

1) Find reasonable complexity (preferably analog) methods to carry out the tasks of inverse and 
especially direct scattering in the transmitter and receiver. 

2) Use the approach discussed in the paper with more complex potentials/waveforms (not reflectionless)
to lower and upper bound the overall capacity (and not just achievable rates).

3) While the problems above are not related to information theory, there is a totally new and interesting
information-theoretic problem that relates to communication via the scattered domain. When receiving
waveforms that are comprised of N-bound solitons or solitons that are concentric due to jitter
(and not through the constructed modulation) one detects a set of scalar values that can be detected but
not differentiated. Essentially, the transmitter and receiver communicate through the transmission
of a set, not a sequence, of perturbed scalar values. Clearly, transmitting and receiving a 3-bound
soliton conveys less information than a sequence of three (ordered in time) solitons. The question
is how much less? We call this problem: communicating with colorless, but not massless, balls.
For more on this issue see the work by Meron et al. [13].